In this Issue

Computer programs for data base management usually use magnetic disc 
as their primary data storage medium and enforce rigid protocols to guarantee 
data consistency and integrity. While these are highly desirable features for 
most applications, they are not without cost. If many transactions occur in 
a short time, the system's response to an individual transaction may be slow, 
or even worse, unpredictable. This isn't acceptable for real-time applica- 
tions — high-speed production lines, for example. HP Real-Time Data Base, 
a data base management system for real-time applications running on HP 
9000 Series 300 and 800 Computers, is designed for predictable response 
time and high speed. It's a memory-resident system, using main memory as its primary data 
storage medium, and it allows the user to disable unnecessary features to increase performance. 
Tests have shown that using direct access (one of three methods), HP Real-Time Data Base can 
retrieve data at a rate of 66,666 56-byte records per second. The article on page 6 describes the 
design of this system and tells how choosing the right design alternatives led to its high perfor- 
mance. 

As long as you know how they were measured, MIPS (millions of instructions per second) and 
MFLOPS (millions of floating-point operations per second) can be useful measures of the relative 
performance of computer system processing units (SPUs). The new SPU for HP 9000 Model 835 
technical computers and HP 3000 Series 935 commercial computers has been tested at 14 MIPS 
and 2.02 MFLOPS running particular benchmark programs (see the footnote on page 19). This 
represents more than a 300% increase in floating-point performance and more than a 50% 
increase in integer performance over this SPU's predecessor, the Model 825/Series 925 SPU.
Responsible for these increases are processor design improvements and a new floating-point 
coprocessor, as explained in the article on page 18. A new 16M-byte memory board was also 
designed and is manufactured using an advanced double-sided surface mount process, described 
on page 23. 

Half-inch reel-to-reel tape drives are widely used for backing up large disc memories in computer 
systems. Desirable characteristics are high speed for minimum backup time and large reel capacity 
so that fewer reels have to be handled and stored. The HP 7980XC Tape Drive uses a sophisticated
data compression scheme to increase reel capacity, as explained in the article on page 26. It 
also uses a complementary technique called super-blocking to deal with certain features of its 
industry-standard 6250 GCR tape format that tend to limit the capacity improvement possible with 
data compression alone. Super-blocking is explained in the article on page 32. Using both data 
compression and super-blocking, the HP 7980XC has achieved capacity improvements of 2.5 to 
5 times, depending on the data. 

High-speed fiber optic communications systems are made up of four basic types of components. 
For example, there are amplifiers, which have electrical inputs and electrical outputs, laser diodes, 
which have electrical inputs and optical (light) outputs, photodiodes, which have optical inputs
and electrical outputs, and optical fibers, which have optical inputs and optical outputs. Accurate 
measurements of the transmission and reflection characteristics of all of these device types, 
needed by both component designers and system designers, are provided by HP 8702A Lightwave 
Component Analyzer systems. Each system consists of a lightwave source, a lightwave receiver, 
the HP 8702A analyzer, and for reflection measurements, a lightwave coupler. In the article on 
page 35, you'll find a description of these systems and a comprehensive treatment of their 
applications and performance. The design of the lightwave sources and receivers is presented 
in the article on page 52. A comparison of the reflection measurement capabilities of the HP 
8702A and the HP 8145A Optical Time-Domain Reflectometer (December 1988) appears on 
page 43. 

Videoscope, the subject of the article on page 58, is a combination of hardware and software
that automates the testing of application software for HP Vectra Personal Computers. While a
test is being run manually, Videoscope records the human tester's keystrokes and mouse movements,
and with the human tester's approval, the correct responses of the application being tested.
It can then rerun the test automatically. Unlike other similar testers, Videoscope doesn't affect
the performance or behavior of the application being tested. The key to this difference is the
hardware, a plug-in card that nonintrusively monitors the video signal of the system running the
application being tested and, for each screen, develops a single-number representation called a
signature. Signature analysis isn't new, having been used for many years for troubleshooting 
digital hardware, but its adaptation to software testing is an ingenious and elegant solution to the 
problem of capturing screens. (Videoscope is an in-house HP tool, not a product.) 

A lot of research has been based on the conjecture that if we could simulate the human brain's 
basic elements — neurons — on a computer, we could connect a bunch of them in a network, and 
we might be able to solve some of the problems that regular computers find difficult but the brain 
handles with ease. This approach has met with some success, particularly with certain optimization 
problems. The theory of neural networks is expressed in differential equations, and its application 
to practical problems is not intuitive. Seeking and not finding a simpler, higher-level method of 
determining the right neuron interconnections, gains, and component values to solve a given 
problem, Barry Shackleford of HP Laboratories developed one. In the paper on page 69, he
explains his approach and applies it to several classic optimization problems such as the traveling 
salesman problem and the eight queens problem. 

While we usually think of metal as something very stable, engineers and physicists who deal 
with integrated circuit chips know that a high enough current density in a thin metal film will cause 
the metal atoms to move. Over long periods of time, the metal piles up in some places and leaves 
holes in other places, causing chip failures. Although electromigration has been studied extensively, 
we still don't have a complete mathematical theory for it. The paper on page 79 reports on a new 
two-dimensional mathematical model that makes it possible to simulate electromigration with good 
accuracy on a computer using exclusively classical physics, not quantum mechanics. The model 
was developed jointly by scientists at HP Laboratories and the California State University at San 
Jose. 

R.P. Dolan 
Editor 

Cover 

One of the potential application areas for the HP Real-Time Data Base is in computer integrated 
manufacturing, where data such as the status of each station on a manufacturing line can be 
monitored in real time for quality control purposes. The picture shows a veterinary bolus (large 
pill) assembly line at the ALZA Corporation in Palo Alto, California. ALZA Corporation researches, 
develops, manufactures, and markets drug delivery systems. ALZA Director of Quality Assurance
Carol L. Hartstein is shown in the inset photo with a simulated monitor screen. Our thanks
to ALZA Corporation for helping us illustrate this application. 



What's Ahead 

In the August issue we'll bring you the designers' view of the HP NewWave environment, HP's 
state-of-the-art user interface for personal computers. The evolution of an existing quarter-inch
tape drive into the HP 9145A with twice the speed and twice the cartridge capacity will also be 
featured. 

A Data Base for Real-Time Applications
and Environments 

HP Real-Time Data Base is a set of subroutines and a query 
facility that enable real-time application developers to build 
and access a real-time, high-performance, memory- 
resident data management system. The software runs in 
an HP-UX environment on an HP 9000 Series 300 or 800 
Computer. 

by Feyzi Fatehi, Cynthia Givens, Le T. Hong, Michael R. Light, Ching-Chao Liu, and Michael J. Wright 



A REAL-TIME ENVIRONMENT deals with current 
phenomena rather than past or future events. If infor- 
mation is lost, it is lost forever since there is rarely 
an opportunity to reclaim it. A typical real-time situation 
is a factory floor where a computer is monitoring the status 
of machines and materials and constantly checking to make 
sure that everything is working properly. Frequently, the 
data from these checks can be discarded once it is deter- 
mined that all is indeed satisfactory, although some data 
might be retained for statistical purposes. If the checking
process reveals something amiss, a real-time process might 
be invoked to correct the situation, such as rejecting a 
flawed part or shutting down an entire assembly line if a 
machine is overheating. Data from such incidents is frequently
saved for later analysis (e.g., statistical quality control).

A real-time environment needs to respond reliably when 
an action must be taken quickly within a brief predeter- 
mined span of time, such as receiving and storing a satellite 
data transmission. If the process of receiving and storing 
the data can always be expected to finish within 80 milli- 
seconds, then the satellite can reasonably transmit every 
100 milliseconds without fear of losing any data. 
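
The timing argument above reduces to a comparison between the worst-case handling time and the transmission period. The following is a minimal sketch of that arithmetic; the function names are hypothetical and are not part of HP RTDB.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper: true when the worst-case time to receive and store
 * one transmission fits inside the interval between transmissions. */
bool keeps_up(long worst_case_ms, long period_ms)
{
    assert(worst_case_ms >= 0 && period_ms > 0);
    return worst_case_ms <= period_ms;
}

/* Slack left in each period. A negative result means the receiver falls
 * further behind with every transmission and data will eventually be lost. */
long slack_ms(long worst_case_ms, long period_ms)
{
    return period_ms - worst_case_ms;
}
```

With the figures from the text, an 80-millisecond worst case against a 100-millisecond period leaves 20 milliseconds of slack. The guarantee depends on the worst case, not the average, which is why predictability matters more than raw speed here.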

Data capture in a real-time environment may involve 
sampling large amounts of raw information with data arriv- 
ing unexpectedly in bursts of thousands of bytes or even 
megabyte quantities. A real-time data base must be capable 
of efficiently storing such large amounts of data and still 
support the expectations of the user for reliable and predict- 
able response. 

When a real-time process requests data, it should be given 
that data immediately, without any unreasonable delay. 
Whether or not the data is consistent may be less of a 
concern than that it is the most current data available. 
Given sufficient urgency, a real-time application may not 
require guarantees of either consistency or integrity of data. 
An application designer must be aware of the risks and 
should only violate normal data integrity rules when abso- 
lutely necessary. A real-time data management system must 
tolerate such violations when they are clearly intentional. 

Finally, a real-time data base must be scalable to the 
needs of different users. This means that users should be 
able to implement or eliminate functionality according to
the needs of the application. The performance impact of
unused functionality must be minimal. 

Traditional Data Bases 

Traditional data bases are generic and flexible, intended 
to support the widest possible range of applications. Most 
traditional data bases use magnetic disc as the primary data 
storage medium because of its large capacity, relatively 
high-speed access, and data permanence. Disc-based data 
bases in the gigabyte range are now possible. 

However, traditional data bases are too slow for most 
real-time applications. Disc access speeds are still two to 
three orders of magnitude slower than dynamic random 
access memory (DRAM) access. Even when the average 
speed of a traditional data base is acceptable, its worst-case 
speed may be totally unacceptable. A critical need of real- 
time systems is the ability to provide a predictable response 
time. Traditional data bases support transaction operations, 
which may require commit protocols, logging and recovery 
operations, and access to disc. They also define data access 
methods that rigidly enforce internal rules of data consis- 
tency and integrity. Given a large number of simultaneous 
transactions, it becomes nearly impossible to guarantee predictable
response time. For example, data that is modified
as part of an update transaction may not be available to a
reader process until the entire transaction is committed. If
the reader in this case were a real-time assembly line control
process, it could be disastrously delayed waiting for a
process of much less importance to complete.

Fig. 1. An overview of the HP Real-Time Data Base system. The user reaches the
real-time data base either through query commands or through a C application
that makes programmatic calls to the HP RTDB routines.

Real-Time Data Bases 

Because of extremely high performance requirements, 
real-time data bases are often custom-designed to the par- 
ticular needs of a given application. This limits their usa- 
bility for other applications and causes portability prob- 
lems if their performance relies upon any hardware charac- 
teristics. They are also usually expensive to program and 
maintain. 

Real-time data bases have taken two common approaches, 
acting either as a very fast cache for disc-based data bases 
or as a strictly memory-resident system which may period- 
ically post core images to disc. Real-time data bases acting 
as a high-speed cache are capable of quickly accessing only 
a small percentage of the total data kept on disc, and the 
data capacities of purely memory-resident data bases are 
severely limited by the amount of available real memory. 
In either case, real-time data bases must coexist with disc- 
based data bases to provide archival and historical analysis 
functions in real-time applications. Eventually, a portion 
of real-time data is uploaded to a disc-based data base. 

Data transfer between real-time and disc-based data bases 
requires commonly understood data types, and may require 
reformatting or other treatment to make it digestible to the 
target data base. Frequently, data is transferred over a network
interface as well. The problems of interfacing real-
time data bases with disc-based data bases are often further 
complicated by the customized, nonstandard nature of 
most real-time data bases. 

HP Real-Time Data Base 

HP Real-Time Data Base (HP RTDB) is one of the Indus- 
trial Precision Tools from HP's Industrial Applications 
Center. The Industrial Precision Tools are software tools 
intended to assist computer integrated manufacturing 
(CIM) application developers by providing standard software
solutions for industrial and manufacturing applications
problems. The HP Real-Time Data Base is the data
base tool. HP RTDB is a set of software routines and interactive
query commands for creating and accessing a high-performance
real-time data base. It is designed for the specific
needs of real-time systems running on HP 9000 Series 300
and 800 Computers.

Fig. 2. A table structure in HP RTDB consisting of six tuples
and five columns.

Fig. 1 shows an overview of the HP Real-Time Data Base 
system. Access to the data base for the user is through the 
query commands or an application program written in C 
that uses the HP RTDB routines. The HP RTDB routines 
provide the application developer with the ability to: 

■ Define or change the data base schema 

■ Build the data base in memory 

■ Read or write data from or to the data base 

■ Back up the schema and data. 

The query commands provide an interactive facility for 
configuring and debugging the data base, and for processing 
scripts in batch mode and on-line without writing a pro- 
gram. The configuration file is automatically created when 
the user defines the data base. It contains the system tables 
and control structures for the data base.

Besides the two interfaces to the data base, HP RTDB
also provides the following features: 

■ Performance. HP RTDB supports predictable response
time, and to ensure speed, HP RTDB is entirely memory-resident.
Several design alternatives were chosen to ensure
this high performance, such as preallocation of all
data base memory to minimize memory management
overhead, alignment of data on machine word boundaries,
simple data structures, and the extensive use of
in-line macros to reduce the overhead of function calls.
The design alternatives chosen produced performance
results that exceed initial goals. For example, performance
tests showed that 66,666 56-byte records can be
directly retrieved from an HP Real-Time data base in
one second.

■ Multiple Data Base Access. HP RTDB resides in shared 
memory so that multiple processes can access the data 
base concurrently. Also, one process can access multiple 
data bases. 

■ Simple Data Structures. Data is stored in HP RTDB in 
two forms: tables and input areas. A table is an array of 
columns and rows (tuples) that contain related informa- 
tion (see Fig. 2). Input areas are areas in the data base 
designed to receive large blocks of unstructured data. 
Often this data comes from high-speed data acquisition 
devices. 

■ Data Access. Retrieval routines are constructed so that 
any specified data can be accessed directly, sequentially, 
or by hash key values. Direct access is available to a 
tuple (row) of a table and to an offset in an input area. 

■ Dynamic Reconfiguration. Tables and input areas can be 
added or deleted quickly without having to recreate the 
data base. 

■ Security. HP RTDB provides three levels of password 
protection: data base administrator access, read-write ac- 
cess, and read-only access. 

■ Backup and Recovery. The schema (data base structure), 
and optionally the user's entire data base can be saved 
to a disc file. 

■ Locking. HP RTDB provides table and input area locking.
This means that an application can exclusively access
a table or input area until it decides to release the
lock. If the application requires, read-through and/or 
write-through locks are allowed. 

■ Scalability. HP RTDB is scalable. If some features are 
not required they can be eliminated to improve perfor- 
mance. 

■ Documentation Aids. HP RTDB is supplied with a self- 
paced tutorial complete with a query/debug script to 
build the data base that is used in the tutorial examples. 
There is also an on-line help facility for the interactive 
query/debug utility. 

■ Programming Aids. HP RTDB programming aids include: 

□ A standard C header file defining the constants and
data structures required to use the HP RTDB subroutines

□ Prototyping and debugging capabilities of the query/
debug utility

□ On-line access to explanations of HP RTDB error codes

□ User-configurable error messages, formats, and hooks
for user-written error routines

□ Native language support, which includes 8-bit data,
8-bit filenames, and message catalogs.

HP RTDB Modules 

The HP Real-Time Data Base modules can be grouped
into two main categories: user-callable routines and internal
data base routines (see Fig. 3). The user-callable
routines include the following functions.

■ Administrative Functions

□ Define the data base including its name, passwords,
and system limits (MdDefDb)

□ Build or rebuild the data base in memory (MdBuildDb)

□ Remove a data base from memory (MdRmDb)

□ Change data base system limits or passwords
(MdChgDb, MdChgPwd).

■ Data Definition Functions

□ Define a table or input area (MdDefTbl, MdDefIA)

□ Define or add column(s) to a user table (MdDefCol)

□ Define an index on column(s) in a defined table
(MdDefIx)

□ Remove a table or an input area (MdRmTbl, MdRmIA)

□ Remove an index from a table (MdRmIx).

■ Session Begin or End Functions

□ Open the data base and initiate a session (MdOpenDb)

□ Close the data base and terminate a session (MdCloseDb).

■ Data Manipulation Functions

□ Open a table or input area for access (MdOpenTbl,
MdOpenIA)

□ Get a tuple by sequential search (MdGetTplSeq), hash
key index (MdGetTplIx), or tuple identifier (MdGetTplDir)

□ Compare a tuple value with a set of expressions
(MdCompare)

□ Add or remove a tuple to or from a table (MdPutTpl,
MdRmTpl)

□ Update a tuple (MdUpdTpl)

□ Get or put a value from or to an input area (MdGetIA,
MdPutIA)

□ Lock or unlock a table or an input area (MdLock,
MdUnlock).

■ Utility Functions

□ Save the data base schema and optionally the entire
data base to disc (MdTakeImage)

□ Release resources held by prematurely terminated
processes (MdCleanup)

□ Provide information on the columns of a table
(MdColInfo)

□ Provide information on the minimum data base and
schema size in bytes (MdDbSizeInfo)

□ Provide information on all or one specific user table,
index on a table, or input area (MdTblInfo, MdIxInfo,
MdIAInfo)

□ Provide information on system tables and session use
(MdSchInfo).

The internal data base routines are used by either the 
user-callable routines or other internal routines. They are 
implemented as C functions or macros. The macro im- 
plementations are used for small pieces of code. The result- 
ing code is slightly larger but faster. The functions per- 
formed by the internal data base routines include: 

■ System Table Manager. These routines handle the tables
that define the schema and configuration of the data base.

■ Index Manager. These routines handle hashing, index
manipulation, and formulation of hash index key values. 

■ Concurrency Manager. These routines handle locking 
operations and control concurrent processes using the 
data base. 

■ Storage Manager. These routines handle memory man- 
agement, which includes keeping track of allocated and 
available shared memory. 

■ Operating System Interface. These routines provide a 
clean and consistent interface to the HP-UX operating 
system. 

■ Table and Tuple Manager. These routines handle func- 
tions related to tuples and tables such as copying, adding, 
or deleting tuple values. 
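
The trade-off between functions and macros mentioned before this list can be illustrated with a small sketch. The names below are hypothetical and are not taken from the HP RTDB source. The example only shows the general C technique of expanding a small test in line instead of calling a function.

```c
#include <assert.h>
#include <stddef.h>

/* Function form: each use is a call unless the compiler inlines it. */
int slot_filled_fn(const unsigned char *slots, size_t i)
{
    assert(slots != NULL);
    return slots[i] != 0;
}

/* Macro form: the test is expanded in line at every use, so there is no
 * call overhead, at the cost of repeating the code at each site. */
#define SLOT_FILLED(slots, i) ((slots)[(i)] != 0)

/* Count filled slots using the macro form. */
size_t count_filled(const unsigned char *slots, size_t n)
{
    size_t count = 0;
    for (size_t i = 0; i < n; i++)
        if (SLOT_FILLED(slots, i))
            count++;
    return count;
}
```

Each expansion of the macro adds a few instructions to the caller, which is the "slightly larger but faster" effect the text describes.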

Data Structures 

The data structures in HP RTDB are divided into two 
categories: those that manage and control access to the data 
base and define the schema, and those that contain the 
user data. Fig. 4 shows an overview of these structures in 
shared memory. The data structures in the schema and
control section are automatically created when the data
base is defined. The data structures in the user area are
added later when the data base is built. Only two of these
data structures are visible and accessible to the user — user 
tables and input areas. 

■ Main Control Block. The main control block contains 
the data base status, limits, and pointers to other data 
structures in the schema and control section of the data 
base. It also contains information used by the storage 
manager, such as pointers to the beginning and end of 
free memory storage space, a pointer to the list of free 
storage blocks, and the total amount of free storage left. 

■ Session Control Blocks. A session control block is allo- 
cated to each process accessing the data base. Each block 
contains a session identifier, a pointer to the main control 
block, and other information about the process, such as 
the user identifier (HP-UX uid) and the process identifier
(HP-UX pid). The session identifier is returned to the user 
when the data base is opened, and is used in subsequent 
calls to access the data base. The number of session 
blocks determines the number of users that can have 
access to the same data base at any one time. This number 
is determined when the data base is created. 

■ Semaphore Control Blocks. There is a semaphore control 
block for each lockable object in the data base (i.e., user
tables and input areas). These blocks contain HP-UX
semaphore identifiers. 

■ Locks-Held Table. Each entry in the locks-held table indicates
whether a lock is being held by a session on a
certain data base object (user table or input area), and if
so, what type of lock.

■ Index Tables. Index tables contain the data for performing
fast access (i.e., hash indexing) to system and user
tables. 

■ System Tables. System tables contain the schema (struc- 
ture) of the data base and information about the locations 
and attributes of all data base objects, including them- 
selves. 

■ User Tables and Input Areas. The application data man- 
aged by the user is contained in the user tables and input 
areas. 
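
The storage bookkeeping described for the main control block can be sketched as follows. This is a simplified illustration under stated assumptions, not the HP RTDB implementation. The structure and function names are hypothetical, the region is assumed to be preallocated once, and the list of freed storage blocks that the article mentions is omitted for brevity.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical subset of a main control block: only the storage fields. */
struct main_ctl {
    unsigned char *free_begin;  /* start of unused storage */
    unsigned char *free_end;    /* one past the end of the preallocated region */
    size_t         free_total;  /* bytes of free storage left */
};

/* Adopt a region that was allocated once, up front. */
void storage_init(struct main_ctl *m, unsigned char *region, size_t size)
{
    assert(m != NULL && region != NULL);
    m->free_begin = region;
    m->free_end = region + size;
    m->free_total = size;
}

/* Hand out nbytes rounded up to a four-byte word; NULL when space runs out. */
void *storage_alloc(struct main_ctl *m, size_t nbytes)
{
    size_t need = (nbytes + 3u) & ~(size_t)3u;
    if (need < nbytes || need > m->free_total)
        return NULL;
    void *p = m->free_begin;
    m->free_begin += need;
    m->free_total -= need;
    return p;
}
```

Because the region is obtained once, an allocation is a pointer adjustment and a subtraction, with no call into the operating system on the data path.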

Tables. The table, which is a two-dimensional array consisting
of rows (tuples) and columns, is the fundamental
data structure in HP RTDB. There are three types of tables: 
system tables, user tables, and index tables. All tables, 
whether they are system, index, or user tables, have the 
same structure, called a table block (see Fig. 5). A table 
block is divided into two sections: control structures and 
data. Control structures contain the information needed to 
locate, add, or delete data in a table. The data portion of
the table contains the system or user data. The information 
in the control structures includes: 

■ Table Block Header. The header contains information 
needed to access information within the table, such as 
data offsets and table type (i.e., system, index, or user).

■ Slot Array. Each entry in the slot array indicates whether 
a tuple in a table is filled or vacant. The slot array is ac- 
cessed when adding or deleting tuples, and when search- 
ing sequentially. 

■ Column Descriptor Array. The entries in the column 
descriptor array describe the columns in the data portion 
of the table block. Each column descriptor defines the 
column type (i.e., character, byte string, integer, float, 
input area offset, etc.), the column length, and the column
offset in bytes from the start of the tuple (see Fig. 6).
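
The control structures listed above can be pictured with a short sketch. The field names, the enumeration values, and the one-byte-per-slot encoding are assumptions made for illustration. The article specifies only that a column descriptor records type, length, and offset, and that each slot array entry marks a tuple as filled or vacant.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical column types; the article names character, byte string,
 * integer, float, and input area offset among others. */
enum col_type { COL_CHAR, COL_BYTES, COL_INT, COL_FLOAT, COL_IA_OFFSET };

/* One entry of a column descriptor array (field names are assumptions). */
struct col_desc {
    enum col_type type;
    size_t length;   /* column length in bytes */
    size_t offset;   /* byte offset from the start of the tuple */
};

/* Copy one column out of a tuple using only its descriptor. */
void get_column(const unsigned char *tuple, const struct col_desc *cd, void *out)
{
    assert(tuple != NULL && cd != NULL && out != NULL);
    memcpy(out, tuple + cd->offset, cd->length);
}

/* Slot array: one flag per tuple, nonzero when the slot is filled.
 * Returns the index of the first vacant slot, or -1 if the table is full. */
long first_vacant_slot(const unsigned char *slots, size_t nslots)
{
    for (size_t i = 0; i < nslots; i++)
        if (slots[i] == 0)
            return (long)i;
    return -1;
}
```

A sequential search would walk the same slot array and skip vacant entries, which is why the array is consulted both when adding tuples and when scanning.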

The data in each type of table is stored in tuples. The
tuple format, which is the number, length, and type of
columns, must be the same for all tuples in any one table.
However, the tuple format may be different for each table.
The number and size of tuples in a table are limited only
by the amount of real memory available. Each tuple and
all columns within a tuple are word-aligned. Variable-length
columns and null columns are not supported. To
support only fixed-length data and columns may seem
wasteful of real memory, but the wasted space is more than offset
by avoiding the larger, more complex code needed to support
variable-length data and the performance degradation that
would result. Another benefit is that the size of a table has
little effect upon the speed of accessing any given tuple. 
Since all tuples in a table are the same length, a tuple's 
location is fixed and can be quickly determined with one 
simple calculation. Once located, copying the tuple's data 
between the data base and a user buffer can be done by 
words (four bytes at a time) rather than one byte at a time, 
since all data is aligned on machine word boundaries. 
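
The "one simple calculation" and the word-at-a-time copy described above might look like the following sketch. The function names are hypothetical. The code assumes four-byte words, word-aligned buffers, and a tuple length that is a multiple of four, as the text states for HP RTDB tuples.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Address of tuple n in a table whose tuples all have the same length.
 * The cost is one multiply and one add, regardless of table size. */
unsigned char *tuple_addr(unsigned char *data, size_t tuple_len, size_t n)
{
    return data + n * tuple_len;
}

/* Copy a tuple one 32-bit word at a time. Both buffers must be
 * word-aligned and nbytes must be a multiple of four. */
void copy_words(uint32_t *dst, const uint32_t *src, size_t nbytes)
{
    assert(nbytes % sizeof(uint32_t) == 0);
    for (size_t i = 0; i < nbytes / sizeof(uint32_t); i++)
        dst[i] = src[i];
}
```

With variable-length tuples, locating tuple n would require either a scan or an extra offset table, which is the cost the fixed-length design avoids.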

Data in user tables can be any supported C data type or 
an offset into an input area. Users can also store and retrieve 
unsupported data types in table columns defined as a byte 
string type. Using the byte string type, the user can store 
pointers to other tuples in the same or any other table. Data 
compression, alignment of values in tuples, and verifica- 
tion of data types is left to the user's application, where 
these functions can be done more efficiently. HP RTDB 
routines store user data exactly as it is received and retrieve 
user data exactly as it is stored. A positive side effect of 
this is that the storage and retrieval integrity of 16-bit data
(e.g., Katakana or Chinese text) can be guaranteed without 
special routines. 

Because all table types have the same table block struc- 
ture, the same code can be used to perform operations on 
system, index, and user tables. However, system table ac- 
cess is so critical to performance that operations on system 
tables are often performed by special code that takes full 
advantage of the known, fixed locations and formats of 
system tables. 

Tuple Identifiers. A tuple identifier or tid uniquely iden- 
tifies each tuple in every table in the data base including 
system tables, index tables, and user tables. Tuple identifiers
are used by the user to access user tables and by
internal HP RTDB routines to access all the tables in the
data base. A tuple identifier is returned to the user when
a tuple is added to a table (MdPutTpl) or after a successful
table search (MdGetTplSeq) or after a successful indexed access
(MdGetTplIx). Once obtained, a tuple identifier can be
used in subsequent calls to provide extremely fast, direct
access to the same tuple for rereading, deletion, or update.
Direct access by tuple identifier is by far the fastest access
method in the HP RTDB data base.

Fig. 7. Tuple identifier data structure and its relationship to
system tables and user tables.

The data type for a tuple identifier is called tidtype and 
contains three elements: a table number, a tuple number, 
and a version number. 

■ The table number is the tuple number for a tuple in a 
system table that describes the table associated with the 
tuple identifier. Fig. 7 shows the tid for a user table and 
the use of the table number and tuple number entries. 
For system and user tables, the system table containing 
the tuples of table descriptions is called a table system 
table, and for index tables the system table is called an
index system table. System tables are described in detail
later in this article.

■ The tuple number indicates the row in a table containing
the tuple data. 

■ The version number is used to ensure that a tuple being
accessed directly by a tid is the same tuple that was accessed
when the tid was first obtained. For example, suppose
user A adds a tuple to table X and saves the returned
tid for subsequent rereading. If user B accesses table X
and deletes the tuple added by user A and then adds
another tuple to table X, it is possible that the tuple
added by user B could occupy the same location as the
tuple originally added by user A. When user A attempts
to use the same tid on table X for reading the tuple that
was changed by user B, the version numbers won't match
and user A will be prevented from accessing the tuple
and notified of the problem.
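
The user A and user B scenario above can be traced with a toy model. Everything here is an illustrative assumption: the field widths, the per-slot version counter that is incremented on each add, and the function names are not taken from HP RTDB, whose actual tidtype layout and versioning policy the article does not spell out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* The article says a tid holds a table number, a tuple number, and a
 * version number; the field widths and names below are assumptions. */
struct tid {
    uint32_t table_no;
    uint32_t tuple_no;
    uint32_t version;
};

#define NSLOTS 4

/* A toy single table: per-slot fill flag, version counter, and one value. */
struct table {
    bool     filled[NSLOTS];
    uint32_t version[NSLOTS];
    int      value[NSLOTS];
};

/* Add a value in the first vacant slot; returns false if the table is full. */
bool put_tuple(struct table *t, int value, struct tid *out)
{
    for (uint32_t i = 0; i < NSLOTS; i++) {
        if (!t->filled[i]) {
            t->filled[i] = true;
            t->version[i]++;          /* new occupant, new version */
            t->value[i] = value;
            out->table_no = 0;
            out->tuple_no = i;
            out->version = t->version[i];
            return true;
        }
    }
    return false;
}

/* Vacate the slot named by the tid. */
void remove_tuple(struct table *t, const struct tid *id)
{
    assert(id->tuple_no < NSLOTS);
    t->filled[id->tuple_no] = false;
}

/* Direct read: succeeds only if the slot still holds the tuple the tid named. */
bool get_tuple_direct(const struct table *t, const struct tid *id, int *out)
{
    if (id->tuple_no >= NSLOTS || !t->filled[id->tuple_no] ||
        t->version[id->tuple_no] != id->version)
        return false;
    *out = t->value[id->tuple_no];
    return true;
}
```

In this model, a tid saved before a delete and re-add points at the right slot but carries a stale version, so the direct read is refused instead of silently returning another user's tuple.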

Tuple identifiers can be used to chain tuples logically. 
Users can build logical relationships between tuples by 
inserting the tuple identifier of one tuple as a column value 
of another tuple. This concept is illustrated in Fig. 8, where
tidA4 and tidB2 are tuple identifiers. The tuple identifier is
designed so that its value remains constant across data base
restarts and dynamic schema changes. Thus relationships
of tuples, whether system-defined or user-defined, are not
lost when a data base is shut down and restarted.

System and User Tables. The system tables contain the
schema for the data base, and the user tables and input
areas contain the user's application data. The relationship
between these data structures is shown in Fig. 9. There are
four types of system tables:

■ Table System Table. The table system table contains information
on all the user tables and system tables in the
data base including itself. One section of the table describes
system tables and another section describes user
tables. Each tuple in the table system table describes one
table, and the columns contain relevant information
about the attributes of the table described by the tuple
(e.g., table name, tuple length, number of columns, and
so on). Fig. 10 shows a portion of a tuple in the table
system table for a user table (Usertbl02). The entry CSTtid
is the tuple identifier for the starting tuple in the column
system table assigned to Usertbl02, and the entry ISTtid is
the tuple identifier for the starting tuple in the index
system table assigned to Usertbl02. The entry FstBlkOff is
an offset in bytes to the first block of storage for Usertbl02.
When the user adds or deletes a table, the table system
table is updated accordingly. Likewise, when certain
changes (e.g., adding indexes) are made to the user table,
these changes are reflected in the associated tuple in the
table system table.

■ Column System Table. The column system table contains 
information on all columns in a user table. Each tuple 
describes one column in a user table. Some of the information 
kept in the column system table includes column 
type, length, and offset for each user table column. This 
same information is kept in the column descriptor 
array of the user table control block described earlier. 
The reason for having this data in two places is that it 
eliminates one level of indirection when accessing data 
in user table columns. A new tuple is added to the column 
system table when a new column is added to a user 
table. 

■ Index System Table. The index system table contains 
information on the indexes for system and user tables. 
Each tuple in the index system table describes an index 
defined on a system or user table. Indexes on system 
tables are predefined by HP RTDB and indexes on user 
tables are defined only by the user. Indexes are described 
in more detail later in this article. 

■ Input Area System Table. The input area system table 
contains information on user-defined input areas. Each 
tuple contains the input area name, the input area size, 
and the offset (in bytes) of the beginning storage location 
allocated to the input area. 

Indexes. Indexes are defined on tables to provide faster 
access to a data tuple. HP RTDB provides hash indexing. 
Fig. 11 shows the organization of the hash indexing scheme 
employed in HP RTDB. In this scheme a key value, which 
is composed of one or more columns in a table, is sent to 
a hash function that computes a pointer into a table of 
tuple identifiers. Once the tuple identifier is known, the 
desired tuple can be accessed. 

The columns that are used for the key value are designated 
in the index system table described earlier. Fig. 12 
shows the relationship between the index system table and 
the columns in a user table designated for key values. These 
columns are specified when an index is defined for a table. 
In many hashing schemes the hashing function transforms 
a key value into a storage address where the user's data is 
stored. HP RTDB does not use hashing for both storage and 
retrieval of tuples, but only as a very fast retrieval mecha- 
nism. 

The process of inserting a new tuple into a table with a 
hash index takes the following steps: 

■ The tuple is inserted in the first available slot in the user 
table without regard to any index defined on the table. 

■ A location is found in the index table by applying the 
hash function to the key value of the tuple. This location 
is called the primary location for the tuple. If the hash 
function returns a primary location that is already in 
use, a secondary location is found and linked to the 
primary location using a synonym chain. 

■ The tuple identifier of the inserted tuple is stored in the 
designated primary (or secondary) location in the index 
table. 

Fig. 10. A partial view of a tuple in a table system table that 
describes a user table named Usrtbl02, and the connection to 
other system tables that contain information about Usrtbl02. 

The process of retrieving an existing tuple from a table 
using the tuple's key value takes the following steps: 

■ The hash function is applied to the key value to obtain 
the primary location for the corresponding tuple iden- 
tifier in the index table. 

■ If the primary location has no synonyms, the tuple ad- 
dressed by the tuple identifier in the primary location 
is accessed and returned. 

■ If synonyms exist, then each one is accessed in turn until 
one is found with a key value that matches the unhashed 
key value of the requested tuple. If the hash index is 
defined with the option to allow duplicate key values, 
then each tuple with a matching key value will be re- 
turned in the order found. 
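The insertion and retrieval steps above can be sketched as follows. This is an illustrative Python model under stated assumptions, not the HP RTDB implementation: the class HashIndexedTable and its methods are hypothetical, Python's built-in hash() stands in for the real hash function, and the sketch always compares key values, even when a primary location has no synonyms.

```python
class HashIndexedTable:
    """Hypothetical table whose hash index stores tids, not tuples."""

    def __init__(self, capacity, key_cols, primary_slots):
        self.rows = [None] * capacity           # user table storage
        self.key_cols = key_cols                # columns that form the key value
        self.primary = [None] * primary_slots   # entries are [tid, next_secondary]
        self.secondary = []                     # overflow entries, same layout

    def _key(self, row):
        return tuple(row[c] for c in self.key_cols)

    def _home(self, key):
        return hash(key) % len(self.primary)    # primary location for this key

    def insert(self, row):
        tid = self.rows.index(None)             # step 1: first available slot
        self.rows[tid] = row
        loc = self._home(self._key(row))        # step 2: hash the key value
        entry = self.primary[loc]
        if entry is None:
            self.primary[loc] = [tid, None]     # step 3: store tid in primary
        else:
            while entry[1] is not None:         # walk to the end of the synonym chain
                entry = self.secondary[entry[1]]
            self.secondary.append([tid, None])  # step 3: store tid in a secondary
            entry[1] = len(self.secondary) - 1  # link it into the chain
        return tid

    def find(self, key):
        key = tuple(key)
        entry = self.primary[self._home(key)]
        matches = []
        while entry is not None:                # primary first, then each synonym
            tid = entry[0]
            if self._key(self.rows[tid]) == key:
                matches.append(tid)             # duplicates returned in order found
            entry = None if entry[1] is None else self.secondary[entry[1]]
        return matches


# One primary location forces every key to collide, which exercises the chain.
parts = HashIndexedTable(capacity=8, key_cols=(0,), primary_slots=1)
t0 = parts.insert(("bolt", 10))
t1 = parts.insert(("nut", 25))
t2 = parts.insert(("bolt", 40))                 # duplicate key value
found = parts.find(("bolt",))
```

Because the index holds only tuple identifiers, the tuples themselves stay where they were first stored regardless of how the index is organized.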

Independence of retrieval from storage provides HP 
RTDB with some major advantages: 

■ Multiple hash indexes. Each table can have multiple 
hash indexes defined for it, allowing the same table to 
be searched with any number of different keys as shown 
in Fig. 12. 

■ Constant tuple identifiers. A hash index can be rehashed 
without causing any data migration (the data tuple loca- 
tions do not change). This means that applications can 
use direct access by tuple identifier and never be con- 
cerned that rehashing might cause a tuple identifier to 
change. This feature also significantly improves the per- 
formance of applications that frequently update table 
columns used for key values. 

■ Dynamic hash index definition. Unlike direct hashing 
algorithms, hash indexes can be added to or removed 
from existing tables. 

■ Fixed space overhead. The space overhead incurred be- 
cause of defining a hash index is a direct function of the 
number of tuples in a table and does not depend on the 
number of columns, so it does not increase as new col- 
umns are added to a table. 

However, no matter how carefully a hash function is 
designed, it cannot guarantee that collisions will not occur 
when the function is applied to diverse and unpredictable 
sets of keys. Therefore, when a collision does occur, there 
must be a means of specifying an alternate location in the 
index table where the new tuple identifier can be stored. 
HP RTDB uses a form of separate chaining in which each 
hash index consists of two segments, a primary segment 
and a secondary segment (see Fig. 13). 

The primary segment contains the tuple identifiers of all 
keys that hash to an unused primary location. To reduce 
the probability of collisions, the number of locations in the 
primary segment can be configured by the user to be more 
than the maximum number of tuples in the data table. For 
example, if the number of primary segment locations is 
1.25 times the number of tuples in the data table (primary 
ratio), then the load factor (packing density) of each primary 
segment cannot exceed 80 percent. What this means is that 
for a table with eight tuples the number of primary segment 
locations is 10 (1.25 × 8), and if there are no collisions, at most 
eight of the locations in the primary segment will be occupied. A 
higher primary ratio will reduce the probability of collisions 
but will increase the primary segment size. Users can 
adjust each index's primary ratio to achieve the best performance 
and minimum memory consumption. 
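The arithmetic behind the primary ratio can be checked with a few lines of Python. The function names are hypothetical, and rounding up to a whole number of locations is an assumption; the article does not say how fractional results are handled.

```python
import math


def primary_locations(max_tuples, primary_ratio):
    """Number of primary segment locations for a given primary ratio."""
    return math.ceil(max_tuples * primary_ratio)   # assumed: round up


def max_load_factor(max_tuples, primary_ratio):
    """Highest possible packing density of the primary segment."""
    return max_tuples / primary_locations(max_tuples, primary_ratio)


locations = primary_locations(8, 1.25)   # the eight-tuple example from the text
density = max_load_factor(8, 1.25)
```

For the eight-tuple table this gives 10 locations and a maximum load factor of 0.8. Raising the ratio to 2.0 would give 16 locations and a maximum load factor of 0.5, at the cost of a larger primary segment.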

The secondary segment contains the tuple identifiers of 
all data values that hash to a primary location that is already 
in use. This segment provides a separate overflow area for 
secondary entries (synonyms), thus eliminating the prob- 
lem of migrating secondaries (existing synonyms that must 
be moved to make room for a new primary entry). The 
secondary segment is allocated based upon the number of 
tuples in the data table and is guaranteed to be large enough 
for even a worst-case index distribution. After a collision 
occurs at a location in the primary segment, the primary 
location becomes the head of a linked-list synonym chain 
for all secondaries that hash to that primary location. 
Input Areas. Input data in a real-time environment may 
be expected or unsolicited, and can arrive in streams, small 
packets, or large bursts. This data may also involve complex 
synchronization of processes to handle the data. In all 
cases, there is a need to respond to the arrival of the new 
data within a predictable time before it is too old to be of 
value, or is overwritten by the next arrival. 

Input areas provide highly efficient buffering for receiving 
and storing unstructured data into the data base. Users 
can configure a data base with any number of input areas 
of any size up to the limits of available shared memory. 
Values in input areas can be read or updated either using 
offsets in a named input area or, for even higher performance, 
using an input area's actual address as if it were a 
local array variable. Like tables, input areas can be 
explicitly locked and unlocked for control of concurrent 
access. 

Data Access 

Traditional data base transactions are not supported in 
HP RTDB because each access to the data base is considered 
a transaction, and each access is guaranteed to be serialized 
and atomic. However, a system designer can still define 
and implement an application transaction as a set of two 
or more data base accesses, which will complete or fail as 
a group. 

The general data access flow in HP RTDB is shown in 
Fig. 14. The sequence of events to access the data base to 
update a tuple would be: 

■ Obtain the address of the main control block using the 
session identifier (SessID). The session identifier is returned 
to the user when the data base is opened and is 
used in subsequent calls to access the data base. 

■ Obtain the address of the table system table from the 
main control block, and using the table identifier (TblTid) 
obtain the tuple of the user table from the table system 
table. The user is given a table identifier when the table 
is opened. 

■ Obtain the entries in the locks-held and semaphore con- 
trol blocks and lock the user table. The addresses for 
these entries are obtained from the main control block. 

■ Finally, obtain access to the tuple in the user table using 
the user table address obtained from the table system 
table and the tuple identifier (tid). 

This process is the same for input areas, except that the 
input area system table is accessed rather than the table 
system table, and the input area offset is used instead of 
the tuple identifier. 

Performance tests to assess the data access performance 
characteristics of the HP Real-Time Data Base routines were 
run on an HP 9000 Model 825. The benchmark for these 
tests consisted of 56-byte tuples. During the tests, shared 
memory was locked into physical memory. The results of 
these performance tests are summarized in Fig. 15. 
Table Access. There are three methods of locating tuples 
in user tables. Tuples can be read using sequential access, 
hashed key index access, or direct tuple identifier access. 
However, updates and deletions of tuples can only be done 
by direct tuple identifier access. This means that to update 
or delete a tuple, it must first be located by one of the three 
read access methods. Sequential access starts at the begin- 
ning of a table and returns each tuple in the order in which 
it is encountered. Since tuples can be inserted or deleted 
at any location, the physical order of tuples in an active 
table is unpredictable. The user can set up search conditions 
for sequential searching, such as the number of comparisons, 
columns to use, comparison operator (e.g., EQ, 
NE, etc.), and ASCII type (e.g., hexadecimal, octal, etc.). If 
the search conditions are specified, then only those tuples 
that meet the conditions are returned. Sequential access is 
the slowest mode of access, but for very small tables of 10 
to 15 tuples, the speed of sequential searching is comparable 
with indexed searching. Sequential access is most 
appropriate for serially processing most or all of a table's 
data, since it does not use additional memory for index 
tables. 
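A sequential search with optional search conditions might look like the following Python sketch. It is only an illustration: the function sequential_search and the (column, operator, value) condition format are hypothetical, and only EQ and NE are named in the text; LT and GT are assumed here to stand for the "etc."

```python
import operator

# EQ and NE come from the text; LT and GT are assumed additional operators.
OPS = {"EQ": operator.eq, "NE": operator.ne, "LT": operator.lt, "GT": operator.gt}


def sequential_search(rows, conditions=()):
    """Yield (tid, row) in physical order for rows meeting every condition."""
    for tid, row in enumerate(rows):
        if row is None:                         # skip free slots left by deletions
            continue
        if all(OPS[op](row[col], value) for col, op, value in conditions):
            yield tid, row


rows = [("bolt", 10), None, ("nut", 25), ("bolt", 40)]
everything = list(sequential_search(rows))
selected = list(sequential_search(rows, [(0, "EQ", "bolt"), (1, "NE", 10)]))
```

Each returned pair carries the tuple's position, which plays the role of the tuple identifier that a later update or delete would use.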

Indexed access, which uses the indirect hashing technique 
discussed earlier, is much faster than sequential access, 
but still slower than direct access by tuple identifiers. Index 
keys can consist of up to five contiguous or noncontiguous 
table columns of mixed data types and any number of indexes 
can be defined for a single table. Although there is 
no hard limit as to how many indexes can be defined for 
a table, each index requires additional memory for index 
tables and additional processing time to create or update 
each index key defined. Indexed access is best for applica- 
tions that need fast, unordered access to data and that 
mainly perform reads and updates rather than insertions 
and deletions. 

The HP RTDB tuple update routine allows three alterna- 
tive courses of action when columns that make up an index 
key value are about to be updated. One option specifies 
that the indexes should be modified (that is, rehashed) to 
reflect the update. A second option is to deny the update 
and return an error when an index key value is about to 
be changed. For top performance, there is an option to 
update a tuple and bypass checking for index modification. 
This option should only be used if the user's application 
can ensure that a tuple's index key values are never changed 
after its initial insertion into a table. 
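The three update alternatives can be sketched as below. This is a hedged illustration rather than the HP RTDB update routine: the names KeyUpdate, KeyChangeDenied, and update_tuple are hypothetical, and a Python dictionary mapping key values to lists of tuple identifiers stands in for the hash index.

```python
from enum import Enum


class KeyUpdate(Enum):
    REHASH = 1   # modify the index to reflect the new key value
    DENY = 2     # refuse the update and report an error
    BYPASS = 3   # skip the check; the caller guarantees keys never change


class KeyChangeDenied(Exception):
    """Raised when an update would change an index key value."""


def update_tuple(rows, index, key_col, tid, new_row, mode):
    old_key, new_key = rows[tid][key_col], new_row[key_col]
    if mode is not KeyUpdate.BYPASS and old_key != new_key:
        if mode is KeyUpdate.DENY:
            raise KeyChangeDenied(tid)
        # REHASH: move the tid from the old key's entry to the new key's entry.
        index[old_key].remove(tid)
        if not index[old_key]:
            del index[old_key]
        index.setdefault(new_key, []).append(tid)
    rows[tid] = new_row                         # the tuple itself never moves


rows = [("bolt", 10), ("nut", 25)]
index = {"bolt": [0], "nut": [1]}

update_tuple(rows, index, 0, 0, ("bolt", 11), KeyUpdate.DENY)       # key unchanged
update_tuple(rows, index, 0, 1, ("washer", 25), KeyUpdate.REHASH)   # index follows

try:
    update_tuple(rows, index, 0, 0, ("screw", 11), KeyUpdate.DENY)  # key would change
    denied = False
except KeyChangeDenied:
    denied = True
```

With BYPASS the same key-changing call would succeed and leave the index pointing at the old key value, which is why that option is safe only when key values are never changed after insertion.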

Direct access by tuple identifier is by far the fastest form 
of access. A tuple's tuple identifier is returned when it is 
first added to a table and also when the tuple is accessed 
by a sequential or indexed search. The returned tuple iden- 
tifier can then be used to update, delete, or reread the same 
tuple directly any number of times since tuple identifiers 
do not change, even when a table is rehashed. This feature 
offers spectacular benefits when a small set of tuples is 
repeatedly accessed. The tuple identifier can be obtained 
once, stored internally, and then used to access the same 
tuples directly in all subsequent operations. 
Input Area Access. Data in input areas can only be read 
or updated. Since input areas are usually updated by being 
overwritten with new input data, HP RTDB does not provide 
separate routines for deleting and inserting data elements 
in input areas. Nor are these functions really needed, 
since updating an element to a null or non-null value ac- 
complishes the same end. 

Since the structure and content of input areas are application 
dependent and can change at any time, HP RTDB 
does not try to map input areas as it does tables. Data 
elements in input areas are accessed by naming the input 
area and specifying the element's offset and length. HP 
RTDB then locates the start of the input area, adds the 
offset, and reads or updates the selected data element. For 
maximum performance, the input area may optionally be 
addressed directly as if it were a local array, but this access 
mode bypasses HP RTDB's address validity checking and 
concurrency control mechanisms and should only be used 
if the extra performance is critical. The application must 
then ensure correct addressing to avoid data base corrup- 
tion. 
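The two input area access modes can be contrasted in a small Python sketch. Everything here is an assumption made for illustration: the InputArea class, the area name sensor_in, and the packed layout of two 16-bit values are hypothetical, and a bytearray stands in for a region of shared memory.

```python
import struct


class InputArea:
    """Hypothetical named input area backed by a flat byte buffer."""

    def __init__(self, name, size):
        self.name = name
        self.buf = bytearray(size)

    def _check(self, offset, length):
        # Address validity check: the element must lie inside the input area.
        if offset < 0 or length < 0 or offset + length > len(self.buf):
            raise IndexError("element lies outside input area " + self.name)

    def read(self, offset, length):
        self._check(offset, length)
        return bytes(self.buf[offset:offset + length])

    def update(self, offset, data):
        self._check(offset, len(data))
        self.buf[offset:offset + len(data)] = data


area = InputArea("sensor_in", 64)

# Checked access: name the area, give the element's offset and length.
area.update(8, struct.pack(">hh", 1200, -35))
temperature, pressure = struct.unpack(">hh", area.read(8, 4))

# Direct access: treat the area as a local array, with no validity checking.
raw = memoryview(area.buf)
raw[8:10] = struct.pack(">h", 1300)
new_temperature = struct.unpack(">h", area.read(8, 2))[0]

try:
    area.update(62, b"\x00\x00\x00\x00")   # would run past the end of the area
    rejected = False
except IndexError:
    rejected = True
```

The checked path rejects the out-of-range update, while the direct path leaves correct addressing entirely to the application.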

Users can also associate a table column with an input 
area data element by defining the column's data type as a 
pointer to an input area and then assigning the data ele- 
ment's offset as the column's value. The input area offsets 
shown in Fig. 8 are input area pointer types. 

Configuration and Creation 

Much effort was made to keep the process of configuring 
and creating a data base as simple and flexible as possible. 
The final definition of data base objects intended for future 
implementation can be deferred until they are actually 
needed, and other data base objects can be added and/or 
removed after the data base is created. Also, all data base 
configuration and maintenance functions can be done with 
the interactive HP RTDB query/debug commands as well 
as by calling HP RTDB subroutines. This allows users to 
prototype and test data base designs without writing a 
single line of program code. 

Defining the Data Base Schema. The first step in creating 
an HP RTDB data base is to define the system limits of the 
data base, that is, the data base name, the configuration 
file name, and the maximum number of tables, input areas, 
indexes, columns, and sessions that will be accommodated 
by the data base at any time. The user may choose to defer 
the actual definition of some data base objects until a later 
time, but must inform HP RTDB of how many of each object 
may eventually be defined. This information is used to 
ensure that the system tables and control structures for the 
data base are as large as the maximum requested. Preallo- 
cation of contiguous storage for the maximum size of the 
data base objects instead of allocating on an as-needed basis 
eliminates the overhead of checking for available space 
when adding tuples to a table. It also eliminates the over- 
head associated with dynamically allocating and de- 
allocating blocks in a multiuser environment. 

When the system limits are defined, a skeletal schema 
and control structures are generated in shared memory and 
saved on disc in the configuration file. At this point the 
data base schema can be filled out with the definition of 
user tables, columns, input areas, and indexes either 
through query/debug commands or by calls to the HP RTDB 
subroutines (MdDefTbl, MdDefCol, MdDefIx, and MdDefIA). As 
these other data base objects are defined, the information 
about them is entered into the system tables and the schema 
grows more complex. However, no storage is allocated for 
newly defined data base objects until the data base is built. 
The user can at any time save a copy of the current memory- 
resident schema to the configuration file. When the data 
base is fully operational and contains data, the data as well 
as the schema can be saved in the same file. 
Building a Data Base in Memory. Once the system limits 
of the data base are set and the schema defined, the data 
base can be built. First, the schema must be loaded into 
memory. The schema will already be in memory if the data 
base is newly defined. Otherwise, the schema file on disc 
is opened and loaded into memory (MdOpenDb). Using the 
memory-resident schema data, the HP RTDB build routine 
(MdBuildDb) allocates shared memory to each data base object, 
builds and initializes any data or control structures 
associated with the objects, and sets up the logical links 
between the structures. HP RTDB also provides routines 
to calculate the minimum memory needed to build the data 
base from the schema. Additional memory may optionally 
be allocated to allow for future implementation of data 
base objects that are not yet defined in the schema. After 
a data base is built, it is ready to be initialized with appli- 
cation data. 

Locking and Concurrency Control 

There are three components to the synchronization of 
concurrent access to a data base: session management, lock 
management, and semaphore management. As each new 
user process opens the data base, HP RTDB allocates a 
session control block for the new process and the process 
becomes attached to the data base. A session identifier is 
returned to the process for use in subsequent calls to access 
the data base, and the session control block is filled with 
information about the process such as the HP-UX process 
identifier (pid) and user identifier (uid). The session iden- 
tifier is used to index into the locks-held table. With this 
and other data the session manager is able to perform its 
role in controlling concurrent access to the data base. 

Locking in HP RTDB is provided only at the table and 
input area level rather than at the tuple and data item level. 
This coarse granularity of locking is acceptable because in 
a memory-resident data base, locks are normally held for 
very short periods of time. Each object (user table and input 
area) in the data base is owned by a specific semaphore. 
Locking of data base objects is accomplished by acquiring 
the object's semaphore and associating it with the process's 
session identifier. The lock is released by freeing the 
semaphore and breaking the link between the semaphore 
and the session identifier. 
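Object-level locking of this kind can be modeled in a few lines. The sketch below is an assumption-laden illustration, not HP RTDB code: the LockManager class is hypothetical, Python threads stand in for HP-UX processes, and threading.Semaphore stands in for the operating system semaphore that owns each table or input area.

```python
import threading


class LockManager:
    """Hypothetical lock manager: one semaphore per table or input area."""

    def __init__(self, object_names):
        self.semaphores = {name: threading.Semaphore(1) for name in object_names}
        self.locks_held = {}                    # session id -> names of locked objects

    def lock(self, session_id, name):
        self.semaphores[name].acquire()         # blocked callers wait here
        self.locks_held.setdefault(session_id, set()).add(name)

    def unlock(self, session_id, name):
        if name not in self.locks_held.get(session_id, set()):
            raise RuntimeError("session does not hold a lock on " + name)
        self.locks_held[session_id].discard(name)   # break the link to the session
        self.semaphores[name].release()             # free the semaphore


manager = LockManager(["Usertbl02"])
table = {"count": 0}


def worker(session_id):
    for _ in range(1000):
        manager.lock(session_id, "Usertbl02")   # lock the whole table
        table["count"] += 1                     # very short critical section
        manager.unlock(session_id, "Usertbl02")


threads = [threading.Thread(target=worker, args=(sid,)) for sid in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Each lock is held only for a single in-memory update, which is the situation in which the article argues that table-level granularity is acceptable.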

HP RTDB controls the locking and unlocking of 
semaphores, but all queuing and rescheduling of blocked 
processes is handled by the HP-UX operating system. This 
gives HP RTDB a simple, efficient, and reliable concurrency 
control mechanism that is guaranteed to be compatible 
with other HP-UX features. For example, a user could easily 
integrate HP RTDB and the HP real-time extension features 
to implement real-time priorities in an application. HP real- 
time extensions are additions to HP-UX that allow users 
to set high priorities for certain processes. 

If the application does not explicitly lock a data base 
object before trying to access it, the HP RTDB routine called 
to do the access will normally apply an implicit lock of 
the object by default. There are options to allow users to 
read and write through locks, but these options may com- 
promise the integrity of the data base and should be used 
with caution when higher performance is critical. A read- 
through lock allows a high-priority process to access data 
in the data base that may be in the process of being updated 
by another process. 

Security 

HP RTDB provides three levels of security through the 
use of passwords. A password is required to access the 
data base. The password level, which can be data base 
administrator, read-write, or read-only, determines the 
user's access authorization. The data base administrator 
password allows a user process to perform any operation 
on the data base supported by HP RTDB subroutines or by 
the query/debug commands. The read-write password al- 
lows a user process to read, modify, and delete data in the 
data base, but not to perform any of the data base definition 
functions, such as adding or removing a table or index. 
The read-only password allows a user only to read data in 
the data base, but not to delete or modify data, or perform 
any data base definition functions. 

In addition to the password security, the data base ad- 
ministrator (or the root user) can also alter the access per- 
missions of the schema file on disc or the data base's shared 
memory segment to limit access to the data base. 

Backup and Recovery 

Memory-resident data bases are particularly vulnerable 
to power failures and operating system failures because 
both types of failures usually destroy the contents of main 
memory. Battery backup power systems can provide excel- 
lent protection against power failures, but system failures 
pose a problem that really has no good solution. 

The traditional backup methodology of logging all 
changes to a disc file cannot be used if high performance 
is desired; yet there is no other way to keep a secure backup 
with close parallel to the state of memory. HP RTDB pro- 
vides a "snapshot" backup which allows each application 
to choose an acceptable trade-off between performance and 
secure backup. 

At any time, the application can call an HP RTDB routine 
to save an image of the data base schema or the schema 
and data to a disc file. For a data base of 34,000 bytes 
consisting of 6 tables and 2 input areas, a single snapshot 
takes about 0.5 second on an HP 9000 Model 825. Snapshots 
can be taken as often or as rarely as the user application 
chooses, and can be triggered periodically or by specific 
events. Users who can afford the overhead can take more 
frequent backups while users who require top performance 
may rarely or never take a backup. In some real-time appli- 
cations, there is little point in taking a backup since the 
data would be obsolete long before the system could be 
restarted. Data base recovery is also very fast but a data 
base can only be recovered to the point where the last 
snapshot was taken. Either the schema or the schema and 
the data can be recovered. Recovery of the schema only 
would create an empty data base which could then be 
reinitialized with data by the application. 
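The snapshot approach can be sketched as follows. This is a minimal Python illustration under stated assumptions: the functions snapshot and recover, the dictionary layout of the data base, and the file name rtdb.snap are all hypothetical, pickle stands in for whatever image format HP RTDB writes, and writing to a temporary file before renaming is our design choice, not something the article describes.

```python
import os
import pickle
import tempfile


def snapshot(db, path, include_data=True):
    """Save the schema, or the schema and data, to a disc file."""
    image = {"schema": db["schema"], "data": db["data"] if include_data else None}
    tmp = path + ".tmp"
    with open(tmp, "wb") as f:
        pickle.dump(image, f)
    os.replace(tmp, path)        # assumed: swap in the new image in one step


def recover(path, schema_only=False):
    """Rebuild the data base as of the last snapshot."""
    with open(path, "rb") as f:
        image = pickle.load(f)
    if schema_only or image["data"] is None:
        data = {table: [] for table in image["schema"]}   # empty data base
    else:
        data = image["data"]
    return {"schema": image["schema"], "data": data}


db = {"schema": {"parts": ("name", "qty")}, "data": {"parts": [("bolt", 10)]}}
path = os.path.join(tempfile.mkdtemp(), "rtdb.snap")

snapshot(db, path)
db["data"]["parts"].append(("nut", 25))      # change made after the snapshot

restored = recover(path)                     # the later change is not recovered
empty = recover(path, schema_only=True)      # schema only: an empty data base
```

The tuple added after the snapshot is absent from the restored copy, which mirrors the trade-off described above: recovery reaches only as far as the last snapshot.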

Query/Debug Utility 

The query/debug utility is included as part of the HP 
Real-Time Data Base software to provide real-time applica- 
tion developers with a tool to: 

■ Assist with prototyping, testing, and debugging applica- 
tions 

■ Create, modify, and maintain HP RTDB data bases in 
both development and production environments 

■ Use as a simple and flexible HP-UX filter, which when 
combined with other HP-UX commands and utilities, 
can provide useful application functions to HP RTDB 
users without the need for additional code. 

The query/debug utility supports nearly all of the functionality 
of the HP RTDB subroutines. However, it is highly 
generalized and is designed to be safe and friendly rather 
than fast. Therefore, most query/debug functions are signifi- 
cantly slower to execute than equivalent subroutine calls. 

The query/debug command syntax is modelled after the 
Structured Query Language (SQL), an ANSI industry-standard 
relational interface. This resemblance to SQL is intended 
only to make it easier for users who are already 
familiar with SQL. The query/debug utility is not intended 
to support the SQL standards of inquiry or reporting func- 
tionality. 

The query/debug utility was designed as an HP-UX filter 
and conforms to the conventions for filters in its use of 
HP-UX input/output files stdin, stdout, and stderr. This allows 
it to be used with input and output redirection, pipes, and 
so on. Output can optionally be produced without headings 
to enable clean output data to be piped directly into other 
filters or user-written programs. 

Query/debug commands can be entered interactively for 
ad hoc work, or can be read from ordinary disc files for 
repetitive tasks. For example, the commands to define and 
initialize a data base could be saved in a disc file to ensure 
that the data base is always recreated the same way. 
Likewise, simple reports can be generated using query/debug 
command files or by combining query/debug command 
files with HP-UX shell scripts and utilities. 

The query/debug commands provide the following func- 
tionality: 



■ Define, reconfigure, build, remove, and back up a data 
base 

■ Change passwords and shared memory security permis- 
sions 

■ Initialize table and input area values in a new data base 

■ Display data base configuration and status information 

■ Generic add, delete, update, and select of tuple values 
based upon indexed or sequential searches 

■ Display or print all or selected data from all or selected 
tuples in a table in either tabular or list format 

■ Generic update, select, and display of input area values 

■ Load or store data base values from or to disc files 

■ Debugging aids such as hexadecimal data display, a 
"peek" function, and an error trace option for tracing all 
errors that may occur during any query/debug processing 
of the user's data base 

■ On-line help facility for all query/debug commands 

■ Built-in octal, decimal, and hexadecimal integer cal- 
culator 

■ Execution of HP-UX commands without leaving query/debug. 

Conclusion 

The goal of a high-performance data base drove many of 
the design decisions and implementation techniques for 
HP Real-Time Data Base. The performance goals were met 
and exceeded with simple data structures (tables and input 
areas), programming techniques such as macros, and op- 
tions that allow users to eliminate features that affect per- 
formance. The result is a set of routines that enables real- 
time application developers to create custom data bases 
for capturing and retrieving the diverse data structures 
found in real-time environments. 

New Midrange Members of the Hewlett-Packard Precision Architecture Computer Family 

Higher performance comes from faster VLSI parts, bigger 
cache and TLB subsystems, a new floating-point 
coprocessor, and other enhancements. A new 16M-byte 
memory board is made possible by a double-sided surface 
mount manufacturing process. 

by Thomas O. Meyer, Russell C. Brockmann, Jeffrey G. Hargis, John Keller, and Floyd E. Moore 



NEW MIDRANGE HP PRECISION ARCHITECTURE 
computer systems have been added to the HP 9000 
and HP 3000 Computer families. The HP 9000 
Model 835 technical computer and the HP 3000 Series 935 
commercial computer share the same system processing 
unit (SPU). Designed with significantly improved floating- 
point and integer performance, the Model 835/Series 935 
SPU meets the computational needs of mechanical and 
electrical computer-aided engineering (CAE) and multiuser 
technical and commercial applications. 

The HP 3000 Series 935 (Fig. 1) is configured for business 
applications and runs HP's proprietary commercial operating 
system, MPE XL. HP 9000 Model 835 products include 
the Models 835S and 835SE general-purpose multiuser 
computers, the Models 835CHX and 835SRX engineering 
workstations with 2D and 3D (respectively) interactive 
graphics, and the powerful Model 835 TurboSRX 3D solid-rendering 
graphics superworkstation with animation capability 
(Fig. 2). All Model 835 systems run the HP-UX operating 
system. As a member of the HP Precision Architecture 
family, the Model 835/Series 935 SPU supports a wide 
variety of peripherals, languages, networks, and applications 
programs. 

User Requirements 

Like its predecessor, the Model 825/Series 925 SPU, 1 the 
Model 835/Series 935 SPU's definition was driven by re- 
quirements from several different application areas. In ad- 
dition to the requirements of small size, low power dissi- 
pation, low audible noise, flexible I/O configurations, and 
tolerance of a wide range of environmental conditions nor- 
mally required for a midrange technical or commercial 
product, the Model 835/Series 935 SPU design addresses 
several other needs. For scientific computation and mechanical 
and electrical CAE applications, high floating-point 
and integer computational performance is desired. 
The Model 835/Series 935 SPU provides more than a 300% 
increase in floating-point performance and more than a 
50% increase in integer performance over the Model 825/ 
Series 925. The Model 835/Series 935 has been bench- 
marked at 14 MIPS and 2.02 MFLOPS.* 

Customers who own or plan to purchase a Model 825 or 
Series 925 want the ability to upgrade to the faster Model 
835/Series 935 without having to replace the whole computer. 
To meet this requirement, the Model 835/Series 935 
processor is designed so that an upgrade can be easily done 
at a customer's site by exchanging two boards. Because of 
HP Precision Architecture compatibility, user applications 
migrate and realize enhanced performance without modi- 
fication or recompilation. 

For all application areas, main memory capacity is an 
important influence on overall system throughput. To meet 
increased memory requirements, a compact, double-sided 
surface mount 16M-byte memory board has been made 
available. Designed to work in any of the Model 825/Series 
925 or Model 835/Series 935 products, this board doubles 
memory capacity to either 96M or 112M bytes depending 
on the configuration. 

Design Overview 

The Model 835/Series 935 uses many of the same components 
as the Model 825/Series 925 SPU. Common components 
include the mechanical package, the power supply, 
I/O cards, the I/O expander, the battery backup unit, and 
the memory cards. This high degree of commonality not 
only assures easy upgrade potential but also minimized 
design time. 

*MIPS (million instructions per second) performance is relative to a Digital Equipment 
Corporation VAX 11/780 computer in a single-user multitasking environment, as calculated 
from the geometric mean of a suite of 15 integer benchmarks. MFLOPS (million floating-point 
operations per second) performance is measured using the Linpack benchmark, double- 
precision, with coded Basic Linear Algebra Subprograms (BLAS). 
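
As a rough illustration of how a geometric-mean rating of the kind described in the footnote is formed, the following sketch averages a set of per-benchmark speed ratios. The ratios are invented placeholders, not HP's measurements, and the function name is hypothetical.

```python
import math


def geometric_mean(ratios):
    """Return the geometric mean of a sequence of positive ratios."""
    ratios = list(ratios)
    if not ratios or any(r <= 0 for r in ratios):
        raise ValueError("ratios must be non-empty and positive")
    # Averaging logarithms is equivalent to taking the n-th root of the product.
    return math.exp(sum(math.log(r) for r in ratios) / len(ratios))


# Hypothetical per-benchmark speed ratios relative to the reference machine.
example_ratios = [8.0, 16.0, 32.0]
example_rating = geometric_mean(example_ratios)
print(f"geometric-mean rating: {example_rating:.2f}")
```

A geometric mean is often preferred for ratios because the result does not depend on which machine is chosen as the reference.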



A block diagram of the Model 835/Series 935 SPU is 
shown in Fig. 3. The two boards unique to this SPU are the 
processor and processor dependent hardware (PDH) boards 
highlighted in the block diagram and shown in Fig. 4. 

The following sections will explain the approaches taken 
to meet the performance requirements mentioned earlier. 
In addition, the design considerations for a compact 16M-byte 
memory board using a new double-sided surface 
mount manufacturing process will be discussed. 

Processor Board 

The Model 835/Series 935 processor board reuses much 
of the technology developed for the Model 825/Series 925, 
a practice frequently called "leverage" within HP. Eight 
VLSI integrated circuits make up the core of the processor 
board: the CPU (central processing unit), the SIU (system 
interface unit), two CCUs (cache controller units), the TCU 
(TLB controller unit), the FPC (floating-point controller), 
and two commercially available floating-point chips. Of 
these, the CPU, SIU, TCU, and two CCUs are functionally 
identical to those used in the Model 825/Series 925 processor 
but run 20% faster. These parts were designed in HP's 
NMOS-III VLSI process.2,3 The FPC and the floating-point 
chips, new for the Model 835/Series 935 processor, will be 
discussed later. 

In addition to faster VLSI, a number of performance en- 
hancements over the Model 825/Series 925 processor board 
are found on the Model 835/Series 935 processor board. 
These include: 

■ An eight-times-larger cache (128K bytes by 2 sets, unified 
instructions and data). 

■ A two-times-larger translation lookaside buffer or TLB 
(2K instruction entries and 2K data entries). Since HP 
Precision Architecture defines page sizes to be 2K bytes, 
this allows 8M bytes of main memory to be mapped 
directly into the TLB. 




Fig. 2. The HP 9000 Model 835 TurboSRX is a 3D solid-rendering 
graphics superworkstation with animation capability. Like other 
Model 835 multiuser technical computers, it runs the HP-UX 
operating system. 

Fig. 3. Block diagram of the Model 835/Series 935 system processing unit. 



■ A new single-chip clock buffer that replaces over 40 
discrete parts and, along with faster VLSI, allows a 20% 
increase in clock frequency. 

■ A new floating-point coprocessor based on a new NMOS- 
III VLSI floating-point controller (FPC) and two floating- 
point chips. 
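
The 8M-byte figure quoted for the TLB in the list above follows directly from the entry counts and the page size. A minimal sketch of that arithmetic, with illustrative names that are not taken from the article:

```python
PAGE_SIZE_BYTES = 2 * 1024          # HP Precision Architecture page size from the text
INSTRUCTION_TLB_ENTRIES = 2 * 1024  # 2K instruction entries
DATA_TLB_ENTRIES = 2 * 1024         # 2K data entries


def tlb_reach_bytes(entries, page_size_bytes):
    """Memory covered when every TLB entry maps a distinct page."""
    return entries * page_size_bytes


total_reach = tlb_reach_bytes(
    INSTRUCTION_TLB_ENTRIES + DATA_TLB_ENTRIES, PAGE_SIZE_BYTES
)
total_reach_mbytes = total_reach // (1024 * 1024)
print(f"TLB reach: {total_reach_mbytes}M bytes")
```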

The increased cache size and faster cycle time account 
for the increased integer performance and partially account 
for the improved floating-point performance. However, the 
new floating-point system is mainly responsible for the 
greater than 300% improvement in overall floating-point 
performance. 

The performance of a central processing unit can be described 
by three items: 

■ The amount of work each instruction does 

■ The average number of cycles per instruction executed 
(CPI) 

■ The cycle time of the CPU. 
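
One common way to combine these three items, written here with symbols that are introduced for illustration and are not part of the original article, is to express the execution time $T$ of a program as

$$T = N \times \mathrm{CPI} \times t_{\mathrm{cycle}}$$

where $N$ is the number of instructions executed (which falls as each instruction does more work), CPI is the average number of cycles per instruction, and $t_{\mathrm{cycle}}$ is the cycle time of the CPU.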

In general, reduced instruction set computers (RISC) 
trade off the first item with the last two. By performing 
extensive simulations to determine the most important in- 
structions and making appropriate trade-offs in instruction 
power versus cycle time, HP has been able to minimize 
the impact of this trade-off. As a result, the HP Precision 
Architecture instruction set is very powerful despite being 
classified as RISC. Consider, for instance, the 14-MIPS rating 
and the 15-MHz peak instruction rate of the Model 
835/Series 935 SPU. The sustainable instruction rate is approximately 
10.8 MHz,4 not considering cache and TLB 
misses. The actual rate will be lower. The reason that the 
SPU is rated at 14 MIPS relative to a Digital Equipment 
Corporation VAX 11/780 when the actual instruction rate 
is less than 10.8 MHz is that for this suite of benchmarks 
an average mix of HP Precision Architecture instructions 
performs more work than an average mix of instructions 
for the VAX 11/780, a complex instruction set computer 
(CISC). 
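
The figures in this paragraph can be related with a little arithmetic. The sketch below is only a back-of-the-envelope reading of the published numbers; the variable names and the interpretation of the two ratios are assumptions, not quantities defined in the article.

```python
PEAK_RATE_MHZ = 15.0         # peak instruction rate from the text
SUSTAINABLE_RATE_MHZ = 10.8  # sustainable rate, ignoring cache and TLB misses
RATED_MIPS = 14.0            # VAX 11/780-relative rating

# Average number of peak-rate instruction slots consumed per instruction
# when running at the sustainable rate.
slots_per_instruction = PEAK_RATE_MHZ / SUSTAINABLE_RATE_MHZ

# Rating units delivered per million native instructions per second.
# The actual instruction rate is below 10.8 MHz, so this is a lower bound.
work_per_instruction = RATED_MIPS / SUSTAINABLE_RATE_MHZ

print(f"{slots_per_instruction:.2f} slots per instruction")
print(f"at least {work_per_instruction:.2f} rating units per native instruction")
```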

Since the Model 835/Series 935 uses the same CPU chip 
as the Model 825/Series 925, the improvements in CPI have 
been made external to the CPU. When the CPU executes 
directly from the on-board cache memory, it proceeds at 
nearly the full instruction clock rate. However, when an 
instruction or data is needed that is not in the cache, the 
CPU temporarily halts execution while the needed data is 
fetched from main memory. This has the effect of increasing 
the average number of cycles to execute an instruction. By 
increasing the cache size by a factor of eight and increasing 
the TLB size by a factor of two, the cache and TLB hit rates 
are significantly improved. The result is that the CPU can 
execute directly from the cache a much greater percentage 
of the time. 
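
The effect of misses on the average number of cycles per instruction can be sketched with a simple additive model. The model and every number in it are illustrative assumptions; the article does not publish miss rates or miss penalties.

```python
def effective_cpi(base_cpi, misses_per_instruction, miss_penalty_cycles):
    """Average cycles per instruction including stalls for cache or TLB misses."""
    return base_cpi + misses_per_instruction * miss_penalty_cycles


# Placeholder values chosen only to show the trend, not measured data.
BASE_CPI = 1.4
MISS_PENALTY_CYCLES = 20

smaller_cache_cpi = effective_cpi(BASE_CPI, 0.05, MISS_PENALTY_CYCLES)
larger_cache_cpi = effective_cpi(BASE_CPI, 0.01, MISS_PENALTY_CYCLES)
print(f"smaller cache: {smaller_cache_cpi:.2f} cycles per instruction")
print(f"larger cache:  {larger_cache_cpi:.2f} cycles per instruction")
```

In this model a lower miss rate lowers the effective CPI without any change to the CPU itself, which is the point the paragraph makes.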

The third item, the cycle time of the CPU, has also been 
improved in the Model 835/Series 935. The Model 825/ 
Series 925 processor board clock is designed to provide a 
25-MHz clock rate. Operation beyond 25 MHz would push 
the design beyond original specifications. To overcome this 
limitation, an NMOS-III clock chip designed for an earlier 
HP product has been adapted to meet Model 835/Series 
935 design requirements. As an added benefit, part count 
is reduced by more than 40, freeing up critical space for 
other functions. 

By using the faster VLSI chips and the NMOS-III clock 
buffer, the Model 835/Series 935 processor board runs at 
a frequency of 30 MHz. Operation at 30 MHz provides the 
additional benefit that the CTB (central bus, formerly MidBus), 
which is designed to operate at any frequency from 
6.67 MHz to 10 MHz, can be run at its peak frequency of 
10 MHz. This is because the CTB frequency is derived from 
the system clock by a simple divide-by-three circuit in the 
SIU. All CTB boards designed to run at the full 10-MHz 
CTB frequency operate in either the Model 825/Series 925 
or the Model 835/Series 935. 
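
The relationship between the system clock and the CTB frequency can be checked numerically. This sketch assumes that the Model 825/Series 925 derives its CTB clock from its 25-MHz system clock with the same divide-by-three arrangement, which the article implies but does not state outright.

```python
CTB_MIN_MHZ = 6.67
CTB_MAX_MHZ = 10.0


def ctb_frequency_mhz(system_clock_mhz, divider=3):
    """CTB frequency produced by dividing the system clock."""
    return system_clock_mhz / divider


def within_ctb_range(frequency_mhz):
    """True if the frequency lies inside the specified CTB operating range."""
    return CTB_MIN_MHZ <= frequency_mhz <= CTB_MAX_MHZ


model_825_ctb = ctb_frequency_mhz(25.0)  # assumed 825/925 arrangement
model_835_ctb = ctb_frequency_mhz(30.0)
ctb_speedup = model_835_ctb / model_825_ctb
print(f"825/925 CTB: {model_825_ctb:.2f} MHz, 835/935 CTB: {model_835_ctb:.2f} MHz")
print(f"CTB speedup: {ctb_speedup:.2f}x")
```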

All of these enhancements, of course, don't come free. 
The larger cache and TLB would require almost twice the 
area they occupied on the Model 825/Series 925 processor 
board using standard through-hole printed circuit technol- 
ogy. In addition, to achieve the floating-point performance 
goals, the floating-point chips had to be located on the 
processor board to allow faster communication between 
the FPC and the floating-point chips. (In the Model 825/ 
Series 925 SPU, the floating-point coprocessor is split be- 
tween two boards.) These changes, along with additional 
bypassing and other changes, add three large VLSI chips 
(two are 169 pins each and one is 72 pins) and a total parts 
count increase of 79 parts. 

Examination of the Model 825/Series 925 processor board 
reveals that there is very little room for more parts. To fit 
all the extra components onto the processor board, surface 
mount technology (SMT) is used. SMT is rapidly gaining 
favor as the board technology of choice within HP, largely 
because of its increased density and potential for lower 
manufacturing cost. In addition to fitting the extra 79 parts 
onto an already crowded board, SMT allows more than 
96% of the board to be machine-loaded. 

Design Time 

The Model 835/Series 935 project schedule called for 
introduction less than one year after the Model 825/Series 
925. The tight schedule depended upon reusing as much 
technology as possible. Significant work had already gone 
into developing tools based on HP's Engineering Graphics 
System (HP EGS). Most of the VLSI chips are unchanged, 
so a significant part of the design is taken directly from 
the Model 825/Series 925 processor board design. Addi- 
tional custom macros were added to HP EGS to speed lay- 
out. The flexibility of HP EGS allowed easy addition of 
SMT capability to the editor. Software tools developed by 
the lab to perform design rule checks directly on photoplot 
output and to compare results to the original schematic 
were enhanced to understand the additional requirements 
of SMT. The extra effort in tool development paid off well. 
The very first printed circuit boards booted the HP-UX and 
MPE XL operating systems without any workarounds. 

Floating-Point Coprocessor 

The Model 835/Series 935 floating-point coprocessor 
provides hardware assistance for floating-point math operations 
and is implemented by a floating-point controller 
(FPC) and two floating-point integrated circuits. One of the 
floating-point ICs, the ALU, performs the addition, subtraction, 
compare, and convert operations, and the other, the 
MPY, performs the multiply, divide, and optional square 
root operations. All floating-point operations can be either 
single-precision or double-precision and fully support the 
IEEE 754 floating-point standard. 

The FPC, as the name implies, is the central control 
circuit for the floating-point coprocessor. It interprets the 
floating-point instructions and manages the flow of oper- 
ands and results to and from the floating-point chips. The 
FPC contains twelve 64-bit floating-point registers, a status 
register, seven operation-exception registers, and a config- 
uration register. 

The FPC gets its floating-point instructions and operands 
over the cache bus. Instructions come from the CPU, but 
operands are read into the 12 floating-point registers di- 
rectly from the cache. Double-precision operands require 
three cache bus cycles to transfer the data. The first cycle 
transfers the floating-point load instruction and the next 
two transfer the operand. Single-precision operands re- 
quire only two cache bus cycles. When a floating-point 
operation is begun by the CPU, the operands are loaded 
into the operation-exception registers from the floating- 
point registers to be forwarded to the floating-point chips 
over the 64-bit math bus. Although the FPC has seven 
operation-exception registers, it only uses the first two. 
(The remaining five are for architectural compliance.) 
These registers act as a queue for the operations and also 
indicate exceptions in the event of a trap. The first register 
contains the currently executing operation, while the sec- 
ond may contain a queued operation waiting to begin. 
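
The cache bus cycle counts quoted above for floating-point loads are consistent with one cycle for the load instruction plus one cycle per 32 bits of operand. The sketch below encodes that reading; the 32-bit transfer width per cycle and the function name are assumptions made for illustration.

```python
CACHE_BUS_DATA_BITS = 32  # assumed data transferred per cache bus cycle


def fp_load_cycles(operand_bits):
    """Cache bus cycles for a floating-point load: instruction plus operand."""
    if operand_bits not in (32, 64):
        raise ValueError("operand must be single (32) or double (64) precision")
    instruction_cycles = 1
    data_cycles = operand_bits // CACHE_BUS_DATA_BITS
    return instruction_cycles + data_cycles


single_precision_cycles = fp_load_cycles(32)
double_precision_cycles = fp_load_cycles(64)
print(f"single: {single_precision_cycles} cycles, double: {double_precision_cycles} cycles")
```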

Feature Set 

Besides supporting the HP Precision Architecture floating- 
point operations, the FPC has performance enhancements 
that decrease the number of states during which the CPU 
suspends the cache bus while waiting for an FPC instruc- 
tion to be accepted. Three major performance enhance- 
ments designed into the FPC are: 

■ Queuing of two floating-point operations 

■ Register bypassing on interlocking stores 

■ Interlock result bypassing on the math bus. 

The FPC executes floating-point operations in the order 
issued by the CPU. Queuing up at most two floating-point 
operations allows the CPU to issue two floating-point op- 
erations back-to-back without having to wait for the first 
operation to complete before the second operation is ac- 
cepted. Since performance analysis shows that the FPC is 
likely to complete a floating-point operation before a third 
operation is issued by the CPU, the FPC is designed to 
accept only two successive floating-point operations. If a 
third operation is issued to the FPC while there are still 
two operations in the queue, the FPC will suspend the CPU 
until the first floating-point operation is completed. 
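
The queuing behavior described above can be modeled in a few lines. This is a toy behavioral model, not the FPC's actual control logic; the class, its method names, and the operation mnemonics are hypothetical.

```python
from collections import deque


class FpcQueueModel:
    """Toy model of a two-deep, in-order floating-point operation queue."""

    DEPTH = 2

    def __init__(self):
        self._queue = deque()
        self.completed = []

    def issue(self, operation):
        """Try to accept an operation; False means the CPU would be suspended."""
        if len(self._queue) >= self.DEPTH:
            return False
        self._queue.append(operation)
        return True

    def complete_oldest(self):
        """Finish the operation at the head of the queue, preserving issue order."""
        operation = self._queue.popleft()
        self.completed.append(operation)
        return operation


fpc = FpcQueueModel()
accepted = [fpc.issue("fadd"), fpc.issue("fmpy"), fpc.issue("fsub")]
fpc.complete_oldest()                # the first operation finishes
retry_accepted = fpc.issue("fsub")   # the third operation can now be accepted
print(accepted, retry_accepted, fpc.completed)
```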

When a floating-point store instruction is encountered 
in a program, the data in one of the floating-point registers 
will not be immediately available to send to the cache if 
the store specifies a register that is the target of a pending 
floating-point operation. In this case the FPC will suspend 
the CPU until the operation is done before providing the 
data. The penalty for this interlocked store is reduced by two 
states by driving the result onto the cache bus at the same 
time it is being stored into the target floating-point register. 

The third FPC enhancement allows a result on the math 
bus to be reloaded into the math chips as an operand for 
the next operation. This means the FPC does not have to 
reissue the operand over the math bus for the next opera- 
tion. This saves three states in the execution of the follow- 
ing operation for double precision and two states for single. 

Circuit Design 

The FPC is physically divided into ten principal blocks 
(Fig. 5): a cache bus interface, a math bus interface, a register 
stack, interlock logic, two cache bus control programmable 
logic arrays (PLAs), two math bus control PLAs, a test state 
PLA, and a clock generation block. The math bus side of 
the chip includes 64 driver/receiver pairs that are routed 
to the register stack by a 64-bit data bus, and 21 pairs that 
provide various control functions. The cache bus interface 
includes 32 driver/receiver pairs for cache bus data and 27 
pairs for control and identification of transactions and co- 
processor operations. 

An important feature of the FPC is the hardware interlock 
block, which performs two main functions. The first is to 
detect register bypassing opportunities by comparing the 
registers referenced by coprocessor operations and manage 
the bypass data traffic. The second is to determine the 
interlocks for loads and stores, allowing them to be received 
by the FPC and be handled in parallel with arithmetic 
operations if the referenced register is not involved in op- 
erations queued in the FPC. 
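
The second function of the interlock block amounts to a register comparison. A minimal sketch of that check follows, assuming that a load or store must wait whenever its register appears as a source or target of any queued operation; the data structure and names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QueuedOperation:
    """Registers referenced by one operation waiting in the queue."""
    sources: tuple
    target: int


def is_interlocked(register, queued_operations):
    """True if a load or store to `register` must wait for queued operations."""
    return any(
        register == op.target or register in op.sources
        for op in queued_operations
    )


queue = [QueuedOperation(sources=(4, 5), target=6)]
store_waits = is_interlocked(6, queue)        # register 6 is a pending target
load_proceeds = not is_interlocked(9, queue)  # register 9 is not involved
print(store_waits, load_proceeds)
```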

A problem frequently encountered in the design of inte- 
grated circuits with wide bus interfaces such as the FPC 
is the generation of noise on the circuit's power supplies 
when many of the chip's pads are driven at the same time. 
Since VLSI packages in a system are decoupled with bypass 
capacitors to provide stable supply levels at the package 
leads, this noise is caused primarily by the parasitic induc- 
tance of the pin-grid array (PGA) package. The magnitude 
of this noise is given by the relationship 



$$v = L \frac{di}{dt}$$



where v is the noise voltage generated, L is the parasitic 
inductance of the package, and di/dt is the derivative of 
the current through the package leads. The expression 
suggests the two ways used to reduce package noise in the 
FPC. 
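
The relationship can be explored numerically to see how its two factors, the inductance and the rate of change of current, each affect the noise voltage. The sketch below approximates di/dt as a current step divided by a switching time and treats parallel power leads as ideal parallel inductors with no mutual coupling. All component values are placeholders, not FPC data.

```python
def package_noise_volts(lead_inductance_h, current_step_a, switching_time_s,
                        parallel_power_leads=1):
    """Estimate v = L * di/dt for a current step through the package leads."""
    # Ideal parallel inductors: the effective inductance divides by the lead count.
    effective_inductance = lead_inductance_h / parallel_power_leads
    return effective_inductance * current_step_a / switching_time_s


# Placeholder values: 10 nH per lead, 0.5 A step, 2 ns switching time.
baseline = package_noise_volts(10e-9, 0.5, 2e-9)
more_leads = package_noise_volts(10e-9, 0.5, 2e-9, parallel_power_leads=4)
slower_edge = package_noise_volts(10e-9, 0.5, 8e-9)
print(f"baseline {baseline:.3f} V, more leads {more_leads:.3f} V, "
      f"slower edge {slower_edge:.3f} V")
```

Either lowering the effective inductance or slowing the current transition reduces the estimated noise by the same proportion.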

First, the number of leads devoted to power supplies is 
increased, effectively decreasing the parasitic inductance 
between the internal circuitry and the stable voltage levels 
provided by the printed circuit board decoupling. A high- 
pin-count package, a 272-pin PGA, was chosen to minimize 
this effect for the FPC. Second, current switching (di/dt in 
the inductance voltage expression) can be decreased. This 
is the most important effect for the FPC. To minimize this 
effect, attention is focused on the pad drivers, normally 
the most heavily loaded circuits on an IC. The critical factor 
is not the current capability of these drivers once they are 
fully turned on, but how rapidly they are turned on. Fig. 6 
illustrates the solution to achieving low noise in high- 
current drivers. The final driver, inverter 4 in the chain 
driving the PC board load, is a large driver capable of sinking 
or sourcing the off-chip load. Inverter 3 is much smaller, 
with less than one tenth the drive capability of inverter 4, 
and turns on inverter 4 somewhat slower. This slow turn-on 
of inverter 4 doesn't greatly compromise how quickly the 
off-chip load is driven. It is the final current in inverter 4, 
once it is fully turned on, that is the first-order factor in 
how rapidly the off-chip load is driven. 

Fig. 5. Block diagram of the floating-point controller chip. 

Double-Sided Surface Mount Process 



Requests for increased component packing densities by product 
development labs have prompted the development of a full 
double-sided surface mount process (SMT-2) with the ability to 
place reasonably large ICs on the bottom side of the printed 
circuit board. A standard process was developed for use by all 
Hewlett-Packard surface mount manufacturing sites by the Surface 
Mount Development Center of the HP Computer Manufacturing 
Division. The process was implemented by the Colorado 
Surface Mount Center to build the 16M-byte memory board described 
in the accompanying article. Other HP manufacturing 
sites are currently installing this process. 

Design Requirements 

Requirements for the SMT-2 process were received from product 
R&D labs and all the surface mount manufacturing sites. 
First, component packing density improvements were needed. 
Up to two times single-sided surface mount densities were required 
to add memory capability, reduce printed circuit board 
sizes, and minimize the number of printed circuit assemblies. 

Several requests were made for ICs in SOJ (small-outline J 
lead), SOIC (small-outline IC), and PLCC (plastic leaded chip 
carrier) packages on both sides of the printed circuit board. To 
be useful in most applications, 28-pin and sometimes 44-pin ICs 
need to be placed on the bottom side of the board. Larger ICs 
could be restricted to the top of the board for most applications. 

Through-hole components would be required for most assemblies. 
In general, these would be types of components that 
are still difficult to obtain in SMT packages, such as connectors 
and PGAs (pin-grid arrays). 

From a processing standpoint, the SMT-2 process had to be 
compatible with SMT-1, HP's single-sided surface mount process, 
on the same manufacturing line. Minimum additions or 
changes could be made to major pieces of equipment for the 
new process. 



Problems and Solutions 

To develop a process to meet these requirements, several 
questions had to be answered. The first was how to adhere the 
components on the bottom side while reflowing components on 
the top side. Second, how would the through-hole parts be soldered? 
Finally, would the joints formed be as reliable as joints 
on a single-sided surface mount assembly? 

The alternatives considered for adhering the bottom-side components 
during reflow of the top side included: 

■ Glue one side and reflow both sides at once. 

■ Make two passes through reflow using different solder pastes 
that reflow at different temperatures. The components on the 
bottom side would be reflowed first using a higher-temperature 
solder paste. 

■ Reflow the components on the bottom side first. During top-side 
reflow, surface tension would be relied on to keep the 
components on the bottom side from falling off. 

The feasibility of relying on the surface tension to hold parts 
has been demonstrated.1 The mass of a plastic leaded chip 
carrier that can be supported by the surface tension of molten 
solder can be simply calculated from a knowledge of the perimeter 
of each lead wetted by the solder, the surface tension of 
molten solder, and the number of leads per package. This model 
shows that PLCC packages up to 68 pins should not fall off. 
Experiments have shown, however, that problems start to occur 
above 44 pins. This is probably because the model does not 
take into account all the factors affecting the parts during reflow, 
such as belt vibration and incline. Since surface tension will 
support the required components, there is no need for gluing or 
using special solder pastes, which would add considerable complexity 
to the process. 
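
The simple model mentioned above can be sketched as a force balance: the total surface tension force is the product of the number of leads, the wetted perimeter per lead, and the surface tension of the molten solder, and dividing by gravitational acceleration gives the supportable mass. The perimeter and surface tension values below are placeholders chosen for illustration and are not taken from the referenced HP paper.

```python
GRAVITY_M_PER_S2 = 9.81


def max_supported_mass_grams(leads, wetted_perimeter_m, surface_tension_n_per_m):
    """Mass that surface tension alone could hold on the underside of a board."""
    force_newtons = leads * wetted_perimeter_m * surface_tension_n_per_m
    return force_newtons / GRAVITY_M_PER_S2 * 1000.0


# Placeholder inputs, not measured values.
ASSUMED_SURFACE_TENSION_N_PER_M = 0.4
ASSUMED_WETTED_PERIMETER_M = 1.5e-3

capacity_44_pin = max_supported_mass_grams(
    44, ASSUMED_WETTED_PERIMETER_M, ASSUMED_SURFACE_TENSION_N_PER_M
)
capacity_68_pin = max_supported_mass_grams(
    68, ASSUMED_WETTED_PERIMETER_M, ASSUMED_SURFACE_TENSION_N_PER_M
)
print(f"44 leads: {capacity_44_pin:.2f} g, 68 leads: {capacity_68_pin:.2f} g")
```

A package is predicted to stay in place when its mass is below the value returned. As the text notes, this static balance ignores disturbances such as belt vibration and incline.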




Risk Reduction and Flexibility 

One of the FPC design features is that it controls floating-point 
chips of varying speeds from different vendors. This 
allows the use of the FPC in the HP 9000 Models 850S and 
855S as well as the Model 835/Series 935 SPU. 

Fig. 6. Low-noise, high-current pad driver. 

Several design options have been implemented to allow 
the FPC to be as flexible as possible in accommodating the 
various vendor parts requirements. A configurable math 
count register is included in the FPC to accommodate vari- 
ous time requirements for ALU operations, multiplication, 
division, and square root. This register regulates the 
number of cycles the FPC waits before unloading a result 
from the math chips. It is initialized to a default value 
during power-up and can be reset by system software using 
a special implementation dependent instruction. This fea- 
ture had the added benefit of allowing the FPC to be de- 
veloped concurrently with the floating-point chips without 
knowing the final number of cycles needed to complete an 
operation. 

High-Productivity Design 

A number of factors contributed to high design produc- 
tivity for the FPC. One is the extensive reuse of circuits 
first designed for other ICs. The building blocks that went 
into the state machines, clock generation circuitry, cache 
bus interface, and parts of the math bus interface were 
originally designed for chips used in other HP Precision 
Architecture computers. 

Finding a solution for soldering the through-hole parts turned 
out to be quite difficult. The normal solution of wave soldering 
could not be used because of concern about damaging the ICs 
on the bottom of the board. Using a solder pot would give selective 
soldering, but was undesirable because each component 
would have to be soldered separately with a special fixture. A 
single-point applicator could be developed to apply paste one 
point at a time, but would take considerable hardware and 
software development and would be difficult to use with parts 
like PGAs unless the paste was applied from the bottom side 
of the board or before the parts were inserted. Hand soldering 
suffers from extensive labor and poor quality. Solder preforms 
give fairly good soldering results, but many vendors did not have 
the capability to provide them and the incremental cost was quite 
high. 

The alternative that gave the best results and caused the least 
change in the current process was a stencil-reflow process first 
suggested by AMP.2 In this process, solder paste is stenciled 
into the plated-through holes at the same time that paste is stenciled 
onto the surface mount pads for the top side of the boards. 
The through-hole components are loaded after the surface mount 
components and reflowed with either vapor phase or infrared. 
Critical to forming a good solder fillet is getting enough paste to 
provide 110% to 140% fill of the hole. This is done by enlarging 
the holes in the stencil to deposit extra paste on the board surface 
around the plated-through holes and reducing the plated-through 
hole sizes to about 0.007 to 0.010 inch larger than the component 
leads. 
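
The sizing guidance above can be turned into a small calculation. The sketch below assumes that the fill percentage is measured against the volume of the empty plated-through hole, which the text does not define precisely; the lead diameter and board thickness are hypothetical example values.

```python
import math

MIN_CLEARANCE_IN = 0.007   # hole is at least this much larger than the lead
MAX_CLEARANCE_IN = 0.010   # and at most this much larger
MIN_FILL = 1.10            # 110% fill of the hole
MAX_FILL = 1.40            # 140% fill of the hole


def plated_hole_diameter_range(lead_diameter_in):
    """Recommended plated-through hole diameters for a given lead diameter."""
    return (lead_diameter_in + MIN_CLEARANCE_IN,
            lead_diameter_in + MAX_CLEARANCE_IN)


def paste_volume_range(hole_diameter_in, board_thickness_in):
    """Paste volume (cubic inches) giving 110% to 140% fill of the hole."""
    hole_volume = math.pi * (hole_diameter_in / 2) ** 2 * board_thickness_in
    return (MIN_FILL * hole_volume, MAX_FILL * hole_volume)


# Hypothetical example: 0.025-inch lead in a 0.062-inch-thick board.
hole_min, hole_max = plated_hole_diameter_range(0.025)
paste_min, paste_max = paste_volume_range(hole_min, 0.062)
print(f"hole diameter: {hole_min:.3f} to {hole_max:.3f} inch")
```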

The final process is: 

■ Bottom side: stencil, place surface mount components, and 
reflow. 

■ Top side: stencil (for surface mount and through-hole compo- 
nents), place surface mount components, place through-hole 
components, and reflow. 

Reliability testing included standard product environmental 
and strife tests for the Model 835/Series 935 memory board plus 
accelerated shock and vibration testing of single-sided and 
double-sided test boards for comparison. Since most of this testing 
was done using vapor phase reflow, additional testing was 
done to compare vapor phase and infrared reflow for solder joint 
reliability. This was done by subjecting the boards to a predetermined 
level of destructive random vibration.3 In these tests, there 
was no difference between single-sided and double-sided boards, 
there were no failures in product testing of the 16M-byte memory 
board on production prototype and pilot run boards, and infrared 
was statistically better than vapor phase reflow. 

Summary 

As a result of the dual-reflow process, component packing 
density has been increased to about twice the single-sided sur- 
face mount density, with ICs as large as 44-pin PLCCs being 
reliably placed on the bottom side. Process and product strife 
testing has shown no difference in reliability from single-sided 
surface mount technology. Since the surface mount process has 
been changed very little from the single-sided process, the pro- 
cessing cost of a double-sided board is less than that of two 
single-sided assemblies because some of the processes are 
only done once (e.g., clean, depanel, test, etc.). 

Acknowledgments 

I would like to extend appreciation and recognize the following 
individuals for their contributions to the design and implementation 
of the SMT-2 process: Jim Baker, Mai Chow, Conrad Taysorn, 
and Keith Walden. 

Andy Vogen 

Project Manager 
Surface Mount Development Center 

References 

1. D.W. Rice, Double-Sided Surface Attachment: A Guide to What Component Mass 
Molten Solder Will Support, Internal HP Paper, March 9, 1987. 

2. M. Rupert, "Design Characteristics of a Surface Mount Compatible Through-Hole 
Connector," SMTA Expo, Las Vegas, Nevada, October '6-29, 1987. 

3. C. Taysorn, Report on the Results of the IR Reflow Acceptance Testing Using F-16 
Double-Sided Boards, Internal HP Paper, October 14, 1988. 



Once the ten principal blocks of the FPC were designed, 
the 781 signals interconnecting these blocks were routed 
automatically with almost 100% success. This was accom- 
plished by careful planning of these blocks with constraints 
on their design to facilitate automatic routing. Part of the 
routing productivity results from addition of a third level 
of metallization to the existing NMOS-III process. This 
coarse interconnect level is reserved for power and clock 
distribution, saving the finer interconnect levels for signals. 

Testing 

The FPC is the first HP Precision Architecture coproces- 
sor designed to handle more than one operation at a time. 
This approach, while increasing performance, also leads 
to a more complex design and a more difficult testing prob- 
lem. Unlike the pipeline inside the CPU, which controls 
when the next operation will arrive, the coprocessor can 
receive a new floating-point instruction from the CPU at 
any time. The arrival of this instruction can occur just as 
the FPC is shifting its internal queue, discovering a trapping 
condition, or performing some other complex transaction. 
The number of combinations of conditions is extremely 
large and verifying correct operation under all conditions 
is difficult. 



To combat this problem in the testing of the FPC, a test 
program was developed that runs sequences of three float- 
ing-point instructions each, with a variable number of non- 
floating-point instructions inserted between each sequence 
to create all possible timing conditions. Each test sequence 
is run once on the hardware, then emulated in software 
and the results compared. Any differences indicate a failure 
in the hardware. This test program in its final version runs 
more than one billion test sequences of three floating-point 
instructions each. 
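
The structure of such a test program can be sketched as a comparison between a device under test and a software emulation over every combination of operation sequence and timing gap. Everything below is a simplified stand-in: the operation names, the operand values, and the two "hardware" functions are hypothetical, and the real program exercised actual FPC hardware rather than Python functions.

```python
import itertools
import operator

# Hypothetical operation set; the real instruction mix is not given in the text.
OPERATIONS = {"fadd": operator.add, "fsub": operator.sub, "fmpy": operator.mul}


def emulate(sequence, operands):
    """Software reference: apply three operations in order to an accumulator."""
    result = operands[0]
    for name, operand in zip(sequence, operands[1:]):
        result = OPERATIONS[name](result, operand)
    return result


def run_campaign(hardware, max_gap, operands=(1.5, 2.0, 0.25, 3.0)):
    """Compare hardware against emulation for every sequence and gap length."""
    failures = []
    for sequence in itertools.product(OPERATIONS, repeat=3):
        for gap in range(max_gap + 1):  # non-floating-point instructions inserted
            expected = emulate(sequence, operands)
            actual = hardware(sequence, operands, gap)
            if actual != expected:
                failures.append((sequence, gap))
    return failures


def good_hardware(sequence, operands, gap):
    """Stand-in for correct hardware: the timing gap never changes the result."""
    return emulate(sequence, operands)


def faulty_hardware(sequence, operands, gap):
    """Stand-in with an injected bug that appears only at one timing condition."""
    result = emulate(sequence, operands)
    return result + 1.0 if gap == 2 and sequence[1] == "fmpy" else result


clean_failures = run_campaign(good_hardware, max_gap=3)
bug_failures = run_campaign(faulty_hardware, max_gap=3)
print(len(clean_failures), len(bug_failures))
```

The injected fault is caught only because the gap is swept across every value, which is the reason for varying the number of inserted instructions.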

Processor Dependent Hardware 

The Model 835/Series 935 processor dependent hard- 
ware (PDH) board is similar to its counterpart in the Model 
825/Series 925. It provides the processor dependent hard- 
ware interface to the processor board and contains either 
one or two VLSI channel adapters depending on options 
the customer has selected. The primary difference is that 
the board is designed to run at 10 MHz on the CTB, 20% 
faster than the Model 825/Series 925 board. 

The HP CIO channel adapters are the same VLSI design 
used in other HP Precision Architecture computers except 
for their higher operating frequency capability. The standard 
adapter provides the interface between the CTB and 
the CIO bus in the SPU. The optional second CIO channel 
adapter provides an interface to an external CIO expander 
without consuming a CTB slot in the SPU. 

The addition of a second CIO channel adapter chip to 
the PDH board requires that the two channels share buffers 
on the CTB side (see Fig. 3). This is possible because the 
32 address and data signals from each channel can be di- 
rectly tied together: only one channel's signals are active 
at a time. However, the signals that control the direction 
of the CTB buffers cannot be tied directly together. These 
signals are NANDed together on the PDH board, electrically 
separating the control signals from each channel and 
eliminating the possibility of driver conflict. Pull-up resis- 
tors are added to several of the second channel's signals, 
thus guaranteeing the state of these signals even in the 
absence of the second channel. This allows the PDH board 
to be built without the second CIO channel adapter, provid- 
ing a version of the Model 835/Series 935 with only an 
internal channel for customers who do not require a CIO 
expander. 
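
The effect of combining the buffer control signals with a NAND gate and pull-up resistors can be shown with a truth table. This sketch assumes the control signals are active low, which the article does not state; the function and constant names are illustrative only.

```python
def nand(a, b):
    """Two-input NAND on logic levels (True = high, False = low)."""
    return not (a and b)


PULLED_UP = True  # level seen when the second channel adapter is not installed


def ctb_buffer_control(channel1_n, channel2_n=PULLED_UP):
    """Combine two assumed active-low control signals; True means asserted."""
    return nand(channel1_n, channel2_n)


truth_table = {
    (c1, c2): ctb_buffer_control(c1, c2)
    for c1 in (False, True)
    for c2 in (False, True)
}

# With the second channel absent, the output simply follows channel 1.
single_channel_idle = ctb_buffer_control(True)
single_channel_active = ctb_buffer_control(False)
print(truth_table, single_channel_idle, single_channel_active)
```

Under this assumption the pull-up holds the missing channel's input in its inactive state, so the gate output depends only on the installed channel.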

An 8K X 8 static RAM chip is included on the PDH board 
as nonvolatile memory and is used to store operating sys- 
tem panic and loading information. It is kept alive by the 
same circuit that provides backup power to the real-time 
clock. The circuit will provide power from either the pow- 
erfail backup system or the SPU clock battery when main 
power is not available. 

16M-Byte Memory 

The design challenge for the 16M-byte memory board 
shown in Fig. 4 was to double the capacity of the already 
dense memory subsystem in a fixed volume with a minimal 
increase in power consumption. That challenge has been 
met by packaging 144 1M-bit DRAM chips, one 272-pin 
PGA VLSI memory controller, one 100-pin connector, various 
control and buffering logic chips, and the necessary 
bypass capacitors on a 6.75-by-7.25-inch printed circuit 
board (roughly half the size of this page). 

To allow interchangeability, the 16M-byte memory board 
is the same size as the current 8M-byte memory board.1 
Increasing the memory to 16M bytes requires an extra 72 
DRAMs and their bypass capacitors on the bottom side of 
the board. The bottom-side mounting of the DRAMs, which 
are packaged in 0.300-inch SMT packages, required the 
development of a new double-sided surface mount man- 
ufacturing process (SMT-2) and a new approach to printed 
circuit board design and verification. For details, see "Double- 
Sided Surface Mount Process" on page 23. 

The circuitry of the 8M-byte memory board released with 
the Model 825/Series 925 was designed to allow future 
expansion to 16M bytes with only minor modifications. 
Slight modifications were made to allow either 8M or 16M 
bytes to be loaded at the factory using the same board. The 
8M-byte version simply omits the bottom-side DRAMs and 
bypass capacitors. The NMOS-III VLSI memory controller 
was designed to support 2M, 4M, 8M, or 16M bytes, including 
single-bit error correction, double-bit error detection, 
and support for battery backup of the memory contents in 
case of main power failure. 
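
The DRAM count on the board is consistent with the error correction scheme described. The sketch below works through the arithmetic under two assumptions that the article does not state: that each 1M-bit DRAM supplies one bit of a memory word, and that the word carries 64 data bits protected by a Hamming code with an added overall parity bit.

```python
def secded_check_bits(data_bits):
    """Check bits for single-error correction plus double-error detection."""
    r = 1
    # Hamming bound for single-error correction: 2**r >= data + r + 1.
    while 2 ** r < data_bits + r + 1:
        r += 1
    return r + 1  # one extra overall parity bit adds double-error detection


DRAM_CHIPS = 144
BITS_PER_CHIP = 2 ** 20                                   # 1M-bit DRAMs
raw_mbytes = DRAM_CHIPS * BITS_PER_CHIP // 8 // 2 ** 20   # raw storage on the board
DATA_MBYTES = 16
overhead_mbytes = raw_mbytes - DATA_MBYTES

ASSUMED_DATA_BITS = 64                                    # assumption, see lead-in
word_bits = ASSUMED_DATA_BITS + secded_check_bits(ASSUMED_DATA_BITS)
banks = DRAM_CHIPS // word_bits                           # one chip per word bit
print(f"raw {raw_mbytes}M bytes, overhead {overhead_mbytes}M bytes, "
      f"{word_bits}-bit word, {banks} banks")
```

Under these assumptions each bank of 72 chips holds 8M bytes of data, which matches the 8M-byte version that omits the 72 bottom-side DRAMs.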



Although very little new electrical design was required 
in the 16M-byte memory board, time was spent in verifying 
operational parameters and increasing the manufacturabil- 
ity of the board. It was designed with all the test points on 
the bottom side to be electrically tested in an HP 3065 
single-sided probe fixture in manufacturing. This design 
allows a less expensive and more reliable test fixture and 
the use of the same fixture for both the 8M-byte and 16M-byte 
versions. 

Since the 16M-byte board was the first design to use HP's 
new SMT-2 process, the printed circuit board development 
software was modified to evaluate and verify the design of 
the board before sending data out for board fabrication. 
The flexibility of HP EGS allowed easy addition of the 
necessary features to route circuitry to bottom-side compo- 
nents and verify SMT-2 design rule compliance. 

Data Compression in a Half-Inch 
Reel-to-Reel Tape Drive 

A proprietary data compression algorithm implemented in 
a custom CMOS VLSI chip improves the throughput and 
data capacity of the HP 7980XC Tape Drive. 

by Mark J. Bianchi, Jeffery J. Kato, and David J. Van Maren 



HP 7980 TAPE DRIVES are industry-standard, half- 
inch, reel-to-reel, streaming tape drives that operate 
at 125 inches per second, have automatic tape load- 
ing, and can be horizontally rack-mounted for better floor 
space utilization. 1 They are available in a variety of config- 
urations and support three industry-standard tape formats: 
800 NRZI, 1600 PE, and 6250 GCR. 

The HP 7980XC Tape Drive is a new member of this 
family. Its special contribution is its use of a sophisticated 
real-time data compression scheme that provides extended 
performance to the 6250 GCR format. 

The implementation of data compression in the HP 
7980XC involves two different but complementary compo- 
nents. The first component is the data compression engine. 
This engine resides in the HP 7980XC and consists of a 
proprietary integrated circuit and support circuitry. The 
second component is a packing process, referred to as 
super-blocking, that is performed on the data packets that 
have been compressed by the compression engine. Super- 
blocking is performed in the firmware that resides in the 
HP 7980XC drive. When these two components are com- 
bined, the resulting compression scheme provides high 
tape compaction. Tape compaction is a figure of merit for 
compression performance. It is the ratio of the amount of 
tape used by a standard 6250 GCR half-inch tape drive to 
that used by the HP 7980XC in compression mode. It is a 
higher ratio than that for compression alone, since super- 
blocking provides additional tape savings. This article ad- 
dresses the design and implementation of data compression 
in the HP 7980XC. For more detailed information on super- 
blocking, see the article on page 32. 

The Data Compression Engine 

The performance improvement in the HP 7980XC is pro- 
vided by the Hewlett-Packard data compression (HP-DC) 
subsystem. This subsystem can both compress and decom- 
press the data being passed through it. Fig. 1 shows how 
the HP-DC engine fits architecturally into the data path of 
the HP 7980XC Tape Drive. The data compression or de- 
compression occurs between the interface hardware and 
the cache buffering hardware. When data is written to the 
tape drive, it flows from the interface to the HP-DC subsys- 
tem where it is compressed and packed, and then proceeds 
to the cache buffer, where it is queued to be written to the 
tape. Conversely, when data is read from the tape drive, it 
proceeds from the buffer to the HP-DC subsystem, where 
it is unpacked and decompressed, and then to the interface 
and the host computer. 

Data Compression Development 

Development of the Hewlett-Packard data compression 
algorithm began at HP Laboratories, where the basic data 
structures for the algorithm were developed. Years of work 
culminated in an algorithm design that is similar to the 
widely known public-domain version of the Lempel-Ziv 
algorithm, 2,1 but offers distinct advantages. It is adaptive, 
and it is more flexible and offers better performance than 
the public-domain Lempel-Ziv scheme. 

The HP-DC algorithm was presented to the Greeley Stor- 
age Division in the form of an algorithm-based Pascal pro- 
gram. To realize this algorithm in silicon, a number of 
changes were made to the program so that, once im- 
plemented in hardware, the algorithm would still provide 
the high throughput needed by the HP 7980XC's high-speed 
data path. A state-machine simulator was used to bench- 
mark the performance of the integrated circuit and verify 
the data integrity of the algorithm. This simulator was then 
used to architect and design the proprietary IC. 

The Data Compression Algorithm 

The underlying principle behind data compression is 
the removal of redundancy from data. The HP-DC scheme 
performs this by recognizing and encoding patterns of input 
characters. Each time a unique string of input characters 
occurs, it is entered into a dictionary and assigned a 
numeric value. Once a dictionary entry exists, subsequent 
occurrences of that entry within the data stream can be 
replaced by the numeric value or codeword. It should be 
noted that this algorithm is not limited to compressing 
ASCII text data. Its principles apply equally well to binary 
files, data bases, imaging data, and so on. 

Fig. 1. HP 7980XC data path architecture. 

Each dictionary entry consists of two items: (1) a unique 
string of data bytes that the algorithm has found within 
the data, and (2) a codeword that represents this combina- 
tion of bytes. The dictionary can contain up to 4096 entries. 
The first eight entries are reserved codewords that are used 
to flag and control specific conditions. The next 256 entries 
contain the byte values 0 through 255. The remaining loca- 
tions are linked-list entries that point to other dictionary 
locations and eventually terminate by pointing at one of 
the byte values 0 through 255. Using this linked-list data 
structure, the possible byte combinations can be anywhere 
from 2 bytes to 128 bytes long without requiring an exces- 
sively wide memory array to store them. 
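
The dictionary layout described above can be sketched in a few lines of Python. This is only an illustration: the article gives the order of the three regions, but the exact mapping of a byte value to its entry number (here, entry 8 plus the byte value) is an assumption, and the function names are hypothetical.

```python
# Sketch of the three dictionary regions (numbering assumed from the text order).
RESERVED_ENTRIES = 8      # codewords that flag and control specific conditions
ROOT_ENTRIES = 256        # one entry per byte value 0 through 255
MAX_ENTRIES = 4096        # total dictionary size

FIRST_ROOT = RESERVED_ENTRIES
FIRST_LINKED = RESERVED_ENTRIES + ROOT_ENTRIES


def entry_kind(index):
    """Classify a dictionary entry number as reserved, root, or linked."""
    if not 0 <= index < MAX_ENTRIES:
        raise ValueError("entry number out of range")
    if index < FIRST_ROOT:
        return "reserved"
    if index < FIRST_LINKED:
        return "root"
    return "linked"


def root_entry_for_byte(value):
    """Entry number holding a single byte value (assumed mapping)."""
    if not 0 <= value <= 255:
        raise ValueError("not a byte value")
    return FIRST_ROOT + value


# Number of linked-list entries left for multi-byte strings.
LINKED_ENTRIES = MAX_ENTRIES - FIRST_LINKED
```

Under these assumptions, 3832 of the 4096 entries remain available for linked-list strings.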

In the hardware implementation of the HP-DC scheme, 
the dictionary is built and stored in a bank of random-access 
memory (RAM) that is 23 bits wide. Each memory address 
can contain a byte value in the lower 8 bits, a codeword 
or pointer representing an entry in the next 12 bits, and 
three condition flags in the upper 3 bits. The codewords 
range in length from 9 bits to 12 bits and correspond to 
dictionary entries that range from 0 to 4095. During the 
dictionary building phase, the first 512 entries have 9-bit 
codewords, the next 512 entries have 10-bit codewords, 
the next 1024 entries have 11-bit codewords, and the final 
2048 entries have 12-bit codewords. Once the dictionary 
is full, no further entries are built, and all subsequent 
codewords are 12 bits in length. The memory address for 
a given dictionary entry is determined by a complex oper- 
ation performed on the entry value. Since the dictionary 
can contain 4096 entries, it would appear that 4K bytes of 
RAM is all that is needed to support a full dictionary. 
However, in practice, more than 4K bytes of RAM is needed 
because of dictionary "collisions" that occur during the 
dictionary building phase. When a dictionary collision oc- 
curs, the two colliding values are recalculated to two new 
locations and the original location is flagged as a collision 
site. 
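
As a rough illustration of the 23-bit memory word and the codeword lengths described above, the following Python sketch packs and unpacks one dictionary word and returns the codeword length for a given entry number. The field positions follow the text (byte in the lower 8 bits, pointer in the next 12, flags in the upper 3); the function names are hypothetical, and the hashing operation that selects the memory address is not modeled because the article does not describe it.

```python
# Sketch of the 23-bit dictionary word: [3 flag bits | 12 pointer bits | 8 byte bits].
BYTE_BITS = 8
POINTER_BITS = 12
FLAG_BITS = 3


def pack_entry(byte_value, pointer, flags):
    """Combine the three fields into one 23-bit integer."""
    if not 0 <= byte_value < (1 << BYTE_BITS):
        raise ValueError("byte value out of range")
    if not 0 <= pointer < (1 << POINTER_BITS):
        raise ValueError("pointer out of range")
    if not 0 <= flags < (1 << FLAG_BITS):
        raise ValueError("flags out of range")
    return (flags << (BYTE_BITS + POINTER_BITS)) | (pointer << BYTE_BITS) | byte_value


def unpack_entry(word):
    """Split a 23-bit word back into (byte_value, pointer, flags)."""
    byte_value = word & 0xFF
    pointer = (word >> BYTE_BITS) & 0xFFF
    flags = (word >> (BYTE_BITS + POINTER_BITS)) & 0x7
    return byte_value, pointer, flags


def codeword_width(entry):
    """Codeword length in bits for a dictionary entry number 0..4095."""
    if not 0 <= entry < 4096:
        raise ValueError("entry number out of range")
    if entry < 512:
        return 9       # first 512 entries
    if entry < 1024:
        return 10      # next 512 entries
    if entry < 2048:
        return 11      # next 1024 entries
    return 12          # final 2048 entries
```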

An important property of the algorithm is the coupling 
between compression and decompression. In the HP-DC 
IC, these two operations are tied together both in the com- 
pression and decompression processes and in the packing 
and unpacking of codewords into a byte stream. The nature 
of the compression algorithm requires that the compression 
process and the decompression process be synchronized. 
Stated differently, decompression cannot begin at an arbi- 
trary point in the compressed data. It begins at the point 
where the dictionary is known to be empty or reset. This 
coupling provides one of the fundamental advantages of 
the HP algorithm, namely that the dictionary is embedded 
in the codewords and does not need to be transferred with 
the compressed data. Similarly, the packing and unpack- 
ing process must be synchronized. This implies that com- 
pressed data must be presented to the decompression hard- 
ware in the proper order. 

A Data Compression Example 

Fig. 2 is a simplified graphical depiction of the compres- 
sion algorithm implemented in the HP-DC compression 
engine. This example shows an input data stream com- 
posed of the following characters: RINTINTIN. To 
follow the flow of the compression process, Fig. 2 should 
be viewed from the top to the bottom, starting at the left 
and proceeding to the right. It is assumed that the dictionary 
has been reset and initialized to contain the first 256 entries 
of 0 to 255. The dictionary must always be initialized in 
this way to satisfy the requirements of the algorithm's data 
structure. 

The compression algorithm executes the following pro- 
cess with each byte in the data stream: 

1. Get the input byte. 

2. Search the dictionary with the current input sequence 
and, if there is a match, get another input byte and add it 
to the current sequence, remembering the largest sequence 
that matched. 

3. Repeat step 2 until no match is found. 

4. Build a new dictionary entry of the current "no match" 
sequence. 

5. Output the codeword for the largest sequence that 
matched. 

The following lines of code are an algorithmic 
representation of these steps: 

current_byte_sequence := GET_INPUT_BYTE; 
REPEAT 
  REPEAT 
    matched := SEARCH_DICTIONARY(current_byte_sequence, 
                                 returned_codeword); 
    IF (matched = TRUE) THEN 
    BEGIN 
      longest_byte_sequence := current_byte_sequence; 
      longest_codeword := returned_codeword; 
      current_byte_sequence := current_byte_sequence + 
                               GET_INPUT_BYTE; 
    END; 
  UNTIL (matched = FALSE); 
  BUILD_DICTIONARY(current_byte_sequence); 
  OUTPUT_CODEWORD(longest_codeword); 
  current_byte_sequence := current_byte_sequence - 
                           longest_byte_sequence; 
UNTIL (no more input bytes to compress); 

In this example, the compression algorithm begins after 
the first R has been accepted by the compression engine. 
The input character R matches the character R that was 
placed in the dictionary during its initialization. Since 
there was a match, the engine accepts another byte, this 
one being the character I. The sequence RI is now searched 
for in the dictionary but no match is found. Consequently, 
a new dictionary entry RI is built and the codeword for 
the largest matching sequence (i.e., the codeword for the 
character R) is output. The engine now searches for I in 
the dictionary and finds a match just as it did with R. 
Another character is input (N) and a search begins for the 
sequence IN. Since IN does not match any entries, a new 
one is built and the codeword for the largest matching 
sequence (i.e., the codeword for the character I) is output. 
This process continues with a search for the letter N. After 
N is found, the next character is input and the dictionary 



is searched for NT. Since this is not found, a dictionary 
entry for NT is built and the codeword for N is output. The 
same sequence occurs for the characters T and I. A 
codeword for T is output and a dictionary entry is built 
for TI. 

Up to this point, no compression has occurred, since 
there have been no multiple character matches. In actuality, 
the output stream has expanded slightly, since four 8-bit 
characters have been replaced by four 9-bit codewords. 
(That represents a 32-bit to 36-bit expansion, or a 1.125:1 
expansion ratio.) However, after the next character has been 
input, compression of the data begins. At this point, the 
engine is searching for the IN sequence. Since it finds a 
match, it accepts another character and begins searching 
for INT. When it doesn't find a match, it builds a dictionary 
entry for INT and outputs the previously generated 
codeword for the sequence IN. Two 8-bit characters have 
now been replaced by one 9-bit codeword for a compression 
ratio of 16/9 or 1.778:1. 

This process continues and again two characters are re- 
placed with a single codeword. The engine begins with a 
T from the previous sequence and then accepts the next 
character which is an I. It searches for the TI sequence and 
finds a match, so another byte is input. Now the chip is 
searching for the TIN sequence. No match is found, so a 
TIN entry is built and the codeword for TI is output. This 
sequence also exhibits the 1.778:1 compression ratio that 
the IN sequence exhibited. The net compression ratio for 
this string of 9 bytes is 1.143:1. This is not a particularly 
large compression ratio because the example consists of a 
very small number of bytes. With a larger sample of data, 
more sequences of data are stored and larger sequences of 
bytes are replaced by a single codeword. It is possible to 
achieve compression ratios that range from 1:1 up to 110:1. 
The performance section of this article presents measured 
compression ratios for various computer systems and data 
types. 
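
The walk-through above can be reproduced with a short Python sketch. It is not the HP-DC implementation: a Python dict stands in for the hashed dictionary RAM, the entry numbering (8 reserved entries, then the 256 byte values, then new strings from entry 264) is an assumption based on the dictionary description, and the 128-byte string limit and the reserved control codewords are not modeled.

```python
# Minimal dictionary-based compressor in the style of the steps above.
FIRST_ROOT = 8                   # assumed: entries 0-7 are reserved codewords
FIRST_FREE = FIRST_ROOT + 256    # first entry available for new strings
MAX_ENTRIES = 4096               # dictionary stops growing when full


def compress(data):
    """Return the list of codewords (entry numbers) for a byte string."""
    dictionary = {bytes([b]): FIRST_ROOT + b for b in range(256)}
    next_entry = FIRST_FREE
    codewords = []
    current = b""
    for value in data:
        candidate = current + bytes([value])
        if candidate in dictionary:
            # Match: keep extending the current sequence.
            current = candidate
        else:
            # No match: emit the longest match and build a new entry.
            codewords.append(dictionary[current])
            if next_entry < MAX_ENTRIES:
                dictionary[candidate] = next_entry
                next_entry += 1
            current = bytes([value])
    if current:
        codewords.append(dictionary[current])
    return codewords


example = b"RINTINTIN"
codes = compress(example)
input_bits = 8 * len(example)
output_bits = 9 * len(codes)     # every codeword here is below entry 512
print(len(codes), "codewords, ratio", round(input_bits / output_bits, 3))
```

For RINTINTIN the sketch emits seven codewords (R, I, N, T, IN, TI, N), so 72 input bits become 63 output bits, which matches the 1.143:1 net ratio quoted in the text.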

A simplified diagram of the decompression process im- 
plemented in the HP-DC IC is shown in Fig. 3. This example 
uses the output of the previous compression example as 
input. The decompression process looks very similar to 
the compression process, but the algorithm for decompres- 
sion is less complicated than that for compression, since 
it does not have to search for the presence of a given dic- 
tionary entry. The coupling of the two processes guarantees 
the existence of the appropriate dictionary entries during 
decompression. The algorithm simply uses the input 
codewords to look up the byte sequence in the dictionary 
and then builds new entries using the same rules that the 
compression algorithm uses. This is the only way that the 
decompression algorithm can recover the compressed data 
without a special dictionary sent with each data packet. 
The following lines of code represent the algorithm used 
by the decompression process: 



current_codeword := GET_INPUT_CODEWORD; 
REPEAT 
  codeword := current_codeword; 
  REPEAT 
    byte := LOOKUP_DICTIONARY(codeword); 
    PLACE_BYTE_ON_OUTPUT_STACK(byte); 
    FIND_NEXT_ENTRY_IN_LIST(codeword, pointer_to_next_entry); 
    codeword := pointer_to_next_entry; 
  UNTIL (codeword points to tail of list, one of bytes 0-255); 
  BUILD_DICTIONARY(previous_codeword, byte); 
  REPEAT 
    output_byte := POP_BYTE_FROM_OUTPUT_STACK; 
    OUTPUT_BYTE(output_byte); 
  UNTIL (stack is empty); 
  previous_codeword := current_codeword; 
  current_codeword := GET_INPUT_CODEWORD; 
UNTIL (no more input codewords to decompress); 

As in the compression example, it is assumed that the 
dictionary has been reset and initialized to contain the first 
256 entries of 0 to 255. The decompression engine begins 
by accepting the codeword for R. It uses this codeword to 
look up the byte value R. This value is placed on the last-in, 
first-out (LIFO) stack, waiting to be output from the chip. 
Since the R is one of the root codewords (one of the first 
256 entries), the end of the list has been reached for this 
codeword. The output stack is then dumped from the chip. 
The engine then inputs the codeword for I and uses this 
to look up the byte value I. Again, this value is a root 
codeword, so the output sequence for this codeword is 
completed and the byte value for I is popped from the 
output stack. At this point, a new dictionary entry is built 
using the last byte value that was pushed onto the output 
stack (I) and the previous codeword (the codeword for R). 
Each entry is built in this manner and contains a byte value 
and a pointer to the next byte in the sequence (the previous 
codeword). A linked list is generated in this manner for 
each dictionary entry. 

The next codeword is input (the codeword for N) and 
the process is repeated. This time an N is output and a 
new dictionary entry is built containing the byte value N 
and the codeword for I. The codeword for T is input, caus- 
ing a T to be output and another dictionary entry to be 
built. The next codeword that is input represents the byte 
sequence IN. The decompression engine uses this 
codeword to reference the second dictionary entry, which 
was generated earlier in this example. This entry contains 
the byte value N, which is placed on the output stack, and 
the pointer to the codeword for I, which becomes the cur- 
rent codeword. This new codeword is used to find the next 
byte (I), which is placed on the output stack. Since this is 
a root codeword, the lookup process is complete and the 
output stack is dumped in reverse order, that is, I is output 
first, followed by N. The same process is repeated with the 
next two codewords, resulting in the recovery of the orig- 
inal byte sequence R I N T I N T I N. 


Data Compression Hardware 

Fig. 4 shows a block diagram of the HP-DC engine subsys- 
tem. The heart of the engine is a custom VLSI chip de- 
veloped using a proprietary HP CMOS process. This chip 
can perform both compression and decompression on the 
data presented to it. However, only one of the two processes 
(compression or decompression) can be performed at any 
one time. Two first-in, first-out (FIFO) memories are located 
at the input and the output of the chip to smooth out the 
rate of data flow through the chip. The data rate through 
the chip is not constant, since some data patterns will take 
more clock cycles per byte to process than other patterns. 
The instantaneous data rate depends upon the current com- 
pression ratio and the frequency of dictionary entry colli- 
sions, both of which are dependent upon the current data 
and the entire sequence of data since the last dictionary 
reset. The third section of the subsystem is a bank of static 
RAM that is used for local storage of the current dictionary 
entries. These entries contain characters, codeword point- 
ers, and control flags. 

Fig. 5 shows a block diagram of the HP-DC integrated 
circuit. The HP-DC chip is divided into three blocks: the 
input/output converter (IOC), the compression and decom- 
pression converter (CDC), and the microprocessor interface 
(MPI). These blocks are partitioned for effective manage- 
ment of the boundary conditions of the algorithm. Each 
block is well-defined and the coupling between blocks is 
very low. As a result, each of the blocks runs independently 
of the other two. This results in maximum chip perfor- 
mance. 

Fig. 4. HP-DC engine block diagram. 

The MPI section provides facilities for controlling and 
observing the chip. It contains six control registers, eight 
status registers, two 20-bit input and output byte counters, 
and a programmable automatic dictionary reset circuit. The 
control and status registers are accessed through a general- 
purpose 8-bit microprocessor interface bus. The control 
registers are used to enable and disable various chip fea- 
tures and to place the chip into different operating modes 
(compression, decompression, passthrough, or monitor). 
The status registers access the 20-bit counters and various 
status flags within the chip. 

During the development of the HP-DC algorithm, it was 
found that compression ratios could be improved by reset- 
ting the dictionary fairly frequently. This is especially true 
if the data stream being compressed contains very few simi- 
lar byte strings. Frequent dictionary resets provide two 
important advantages. First, resetting the dictionary forces 
the codeword length to return to 9 bits. Second, new dic- 
tionary entries can be made that reflect the present stream 
of data (a form of adaption). The HP-DC chip's interface 
section contains circuitry that dynamically monitors the 
compression ratio and automatically resets the dictionary 
when appropriate. By writing to an interface control regis- 
ter, this circuitry can be programmed to reset automatically 
at a wide range of compression ratio thresholds. Another, 
faster, reset point when the data is expanding guarantees 
a better worst-case compression ratio, which in turn pro- 
vides a level of expansion protection. Most data compres- 
sion algorithms will expand their output if there is little 
or no redundancy in the data. 
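
The idea of ratio-triggered dictionary resets can be illustrated with a small Python sketch. Everything specific in it is hypothetical: the article does not give the threshold values, the measurement windows, or the register format, so the class name, the default numbers, and the two-window scheme below are assumptions chosen only to show a normal reset point and a faster one for expanding data.

```python
class ResetMonitor:
    """Hypothetical compression ratio monitor (all numbers are invented)."""

    def __init__(self, threshold=1.5, window_bytes=4096, fast_window_bytes=512):
        self.threshold = threshold                  # reset below this ratio
        self.window_bytes = window_bytes            # normal measurement window
        self.fast_window_bytes = fast_window_bytes  # shorter window when expanding
        self.in_bytes = 0
        self.out_bits = 0

    def update(self, in_bytes, out_bits):
        """Account for processed data; return True when a reset is due."""
        self.in_bytes += in_bytes
        self.out_bits += out_bits
        if self.out_bits == 0:
            return False
        ratio = 8 * self.in_bytes / self.out_bits
        expanding = ratio < 1.0 and self.in_bytes >= self.fast_window_bytes
        poor = ratio < self.threshold and self.in_bytes >= self.window_bytes
        if expanding or poor:
            # Start measuring afresh after the dictionary reset.
            self.in_bytes = 0
            self.out_bits = 0
            return True
        return False
```

With these made-up defaults, data that expands (one 9-bit codeword per input byte) triggers a reset after 512 bytes, while data that compresses better than 1.5:1 never triggers one.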

The IOC section manages the process of converting be- 
tween a byte stream and a stream of variable-length 
codewords (ranging from 9 bits to 12 bits). Two of the eight 
reserved codewords are used exclusively by the IOC. One 
of these codewords is used to tell the IOC that the length 
of the codewords must be incremented by one. With this 
function controlled by a codeword in the data stream, the 
process of incrementing codeword size is decoupled from 
the CDC section. The IOC operates as an independent 
pipeline process, thus allowing the CDC to perform com- 
pression or decompression without being slowed down by 
the IOC. Another benefit of using a reserved codeword to 
increment the codeword size is that any future HP-DC en- 
gines that have larger codeword sizes will be backward 
compatible with this HP-DC engine. 

The second reserved codeword alerts the IOC that the 
next codeword is the last one associated with the current 
packet of data. From this information, the IOC knows to 
finish its packing routine and end on a byte boundary. This 
feature allows compression of multiple input packets into 
one contiguous output packet while maintaining the ability 
to decompress this packet into its constituent packets. The 
IOC is also capable of allowing data to pass straight through 
from input to output without altering it, and of allowing 
data to pass through while monitoring the potential com- 
pression ratio of the data. These features can be used as 
another level of expansion protection. 
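
The conversion between variable-length codewords and a byte stream can be sketched as a simple bit packer in Python. This is an illustration only: the bit order (most significant bit first) is an assumption, the codeword lengths are passed in explicitly rather than being signaled by reserved codewords in the stream, and the zero padding at the end stands in for finishing a packet on a byte boundary.

```python
def pack_codewords(codewords):
    """Pack (value, width) pairs into bytes, padding the last byte with zeros."""
    accumulator = 0
    bit_count = 0
    packed = bytearray()
    for value, width in codewords:
        if not 0 <= value < (1 << width):
            raise ValueError("codeword does not fit in its width")
        accumulator = (accumulator << width) | value
        bit_count += width
        while bit_count >= 8:
            bit_count -= 8
            packed.append((accumulator >> bit_count) & 0xFF)
        accumulator &= (1 << bit_count) - 1     # keep only the unsent bits
    if bit_count:
        # End the packet on a byte boundary.
        packed.append((accumulator << (8 - bit_count)) & 0xFF)
    return bytes(packed)


def unpack_codewords(data, widths):
    """Recover codeword values from packed bytes, given each codeword's width."""
    accumulator = 0
    bit_count = 0
    position = 0
    values = []
    for width in widths:
        while bit_count < width:
            accumulator = (accumulator << 8) | data[position]
            position += 1
            bit_count += 8
        bit_count -= width
        values.append((accumulator >> bit_count) & ((1 << width) - 1))
        accumulator &= (1 << bit_count) - 1
    return values
```

Seven 9-bit codewords occupy 63 bits, so the packed packet is 8 bytes long with one bit of padding.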

The CDC section is the engine that performs the transfor- 



mation from uncompressed data to compressed data and 
vice versa. This section is composed of control, data path, 
and memory elements that are finely tuned for maximum 
data throughput. The CDC interfaces with the IOC via two 
12-bit buses. During compression, the IOC passes the input 
bytes to the CDC section, where they are transformed into 
codewords. These codewords are sent to the IOC where 
they are packed into bytes and sent out of the chip. Con- 
versely, during decompression the IOC converts the input 
byte stream into a stream of codewords, then passes these 
codewords to the CDC section, where they are transformed 
into a stream of bytes and sent to the IOC. The CDC section 
also interfaces directly to the external RAM that is used to 
store the dictionary entries. 

The CDC makes use of two reserved codewords. The first 
is used any time a dictionary reset has taken place. The 
occurrence of this codeword causes two actions: the IOC 
returns to the state in which it packs or unpacks 9-bit 
codewords, and the CDC resets the current dictionary and 
starts to build a new one. Dictionary resets are requested 
by the MPI section via microprocessor control or the auto- 
matic reset circuitry. The second reserved codeword is gen- 
erated during compression any time the CDC runs out of 
usable external RAM while trying to build a new dictionary 
entry. This event very rarely happens, given sufficient ex- 
ternal RAM. However, as the amount of memory decreases, 
it is more likely that the CDC will encounter too many 
dictionary collisions and will not be able to build new 
dictionary entries. With the reduction of external memory 
and the inevitable increase in dictionary collisions, the 
data throughput and compression performance will be 
slightly degraded. The HP-DC chip supports three different 
memory configurations, so a subsystem cost-versus-perfor- 
mance trade-off can be made with regard to individual 
system requirements. This full-dictionary codeword is also 
used during decompression by the CDC to ensure that the 
decompression process stops building dictionary entries 
at the same point as the compression process. 

Fig. 5. HP-DC chip block diagram. 

30 HEWLETT-PACKARD JOURNAL JUNE 1989 

© Copr. 1949-1998 Hewlett-Packard Co. 

Compression Performance Results 

The two most important performance measures for the 
HP-DC engine are data throughput and data compression 
ratio. Throughput performance is measured as the data rate 
that can be sustained at the uncompressed side of the HP- 
DC engine (i.e., by the host device). This data rate is primar- 
ily dependent upon the compression ratio of the data with 
some minor dependency upon the data pattern. During 
compression, the HP-DC engine will have a minimum 
throughput of 1.0 Mbytes/s and can achieve a maximum 
of 2.4 Mbytes/s. During decompression, the HP-DC engine 
will have a minimum throughput of 1.1 Mbytes/s and can 
achieve a maximum of 2.0 Mbytes/s. The worst-case through- 
put occurs when the input data is completely random and 
as a result is expanding. In any case, the compressed data 
rate is equal to the uncompressed data rate divided by the 
compression ratio. 
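
The relationship between the two data rates can be written as a one-line Python function. The pairing of 2.4 Mbytes/s with a 4:1 ratio in the example is purely illustrative and is not a measured operating point from the article.

```python
def compressed_rate(uncompressed_rate, compression_ratio):
    """Data rate on the compressed (tape) side of the engine, same units as input."""
    if compression_ratio <= 0:
        raise ValueError("compression ratio must be positive")
    return uncompressed_rate / compression_ratio


# Illustrative only: a host-side rate of 2.4 Mbytes/s at an assumed 4:1 ratio.
print(compressed_rate(2.4, 4.0))
```

A ratio below 1:1 (expanding data) makes the compressed-side rate higher than the host-side rate.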

The second performance measure and perhaps the most 
important one is the data compression ratio for various 
data types. This performance was measured by compress- 
ing real user backup data from a variety of computer sys- 
tems. The table below is a summary of the compression 
ratios achieved by the HP-DC engine using this data. The 
test setup included HP 7980A and HP 7980XC half-inch 
tape drives. All of the test data was copied from various 
backup tapes to the HP 7980XC in compression mode, then 
read back and verified while monitoring the compression 
ratio of the HP-DC engine alone. The article on super-block- 
ing (see page 32) discusses the effects these compression 
ratios have on the actual tape compaction ratios. 

Summary of Data Compression Benchmark Results 

Data Description 

MPE/MPE XL on HP 3000s 
  Series 68 (HP Desk) 
  Series 68 (Data Base) 
  Series 68 (Misc. Data) 
  Series 70 (Manufacturing) 
  Series 930 (Code) 
HP-UX on HP 9000s 
  Series 800 (Commercial HP-UX) 
  Series 500 (Code) 
  Series 500 (Data Base) 
  Series 500 (VLSI) 
  Series 300 (Archive) 
DEC 
  DEC VAX (Code) 
HP 9000 Running Pascal O.S. 
  Series 200 (Misc. Data) 
Amdahl 
  Amdahl (HP Corporate Data) 



Volume Compression 
(Mbytes) Ratio 
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2924 
1559 
2924 
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336 
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329 
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5000 
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4.30 
4.31 

Maximizing Tape Capacity by 
Super-Blocking 

Interrecord gaps on the tape limit the capacity improvement 
attainable with data compression in the HP 7980XC Tape 
Drive. Super-blocking eliminates most of these gaps. 

by David J. Van Maren, Mark J. Bianchi, and Jeffery J. Kato 



SUPER-BLOCKING is a proprietary Hewlett-Packard 
method for maximizing half-inch tape data capacity. 
This capacity improvement is achieved by the re- 
moval of some of the interrecord gaps ordinarily placed 
between host data records. It is performed in real time by 
the firmware residing in the cache buffer of the HP 7980XC 
Tape Drive. 

To understand how super-blocking works, one must un- 
derstand the general format of half-inch tapes. When a 
packet of data is sent from a host to a tape drive, the tape 
drive must place this packet on the tape in such a way that 
it can recover the packet and return it to the host exactly 
as it was received. Normally, physical gaps are placed on 
the tape between each data record. These gaps, which are 
areas on the tape containing no flux reversals (and con- 
sequently no data), guarantee that the data packets can later 
be individually recovered. This format of data records with 
interrecord gaps is required to maintain compatibility with 
the ANSI standard for 6250 GCR tapes. 

A typical host will send data records to the drive in sizes 
that range from 4K bytes to 32K bytes. Assuming that a tape 
is written at 6250 bytes per inch, a typical record will be 
between 0.65 inch and 5.25 inches long. The minimum 
interrecord gap length is approximately 0.3 inch. From these 
numbers, one can see that a tape written with 4K bytes of 
data will contain 69% host data and 31% blank tape. This 
means that about one third of the tape is wasted by inter- 
record gaps. 
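
The arithmetic behind these figures can be checked with a short Python sketch. It assumes that 4K bytes means 4096 bytes and uses the 6250 bytes-per-inch density and 0.3-inch gap given above; the function names are hypothetical.

```python
DENSITY_BYTES_PER_INCH = 6250
GAP_INCHES = 0.3


def record_length_inches(record_bytes):
    """Length of tape occupied by the data of one record."""
    return record_bytes / DENSITY_BYTES_PER_INCH


def data_fraction(record_bytes):
    """Fraction of tape holding host data when each record is followed by a gap."""
    data = record_length_inches(record_bytes)
    return data / (data + GAP_INCHES)


for size in (4 * 1024, 32 * 1024):
    print(size, round(record_length_inches(size), 2), round(100 * data_fraction(size)))
```

For 4K-byte records the data occupies about 0.66 inch per record, so roughly 69% of the tape carries data and 31% is gap, as stated in the text.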

Super-blocking is a formatting technique that removes as 
many as possible of these capacity-limiting gaps while re- 
taining enough information to separate individual records. 
This process will pack together as many records as it can 
without exceeding a maximum super-block length of 60K 
bytes. Included at the end of each super-block is information 



that is used in the data retrieval process to separate the 
super-block into its original records. A graphical illustration 
of the super-blocking process is shown in Fig. 1. 
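
A greedy packing of records into super-blocks might look like the following Python sketch. The actual trailer format written by the HP 7980XC is not described here, so the list of record lengths kept with each super-block is a hypothetical stand-in, the trailer is not counted against the 60K-byte limit, and a record larger than the limit is simply given a super-block of its own.

```python
MAX_SUPER_BLOCK_BYTES = 60 * 1024


def build_super_blocks(records):
    """Group records into (payload, record_lengths) pairs of at most 60K bytes."""
    groups = []
    current = []
    size = 0
    for record in records:
        if current and size + len(record) > MAX_SUPER_BLOCK_BYTES:
            groups.append(current)
            current = []
            size = 0
        current.append(record)
        size += len(record)
    if current:
        groups.append(current)
    # The lengths list plays the role of the separation information.
    return [(b"".join(group), [len(r) for r in group]) for group in groups]


def split_super_block(payload, record_lengths):
    """Recover the original records from one super-block."""
    records = []
    offset = 0
    for length in record_lengths:
        records.append(payload[offset:offset + length])
        offset += length
    return records
```

With 4K-byte records, fifteen records fill one 60K-byte super-block, so fifteen interrecord gaps collapse into one.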

Fig. 2 demonstrates the effect that decreasing the record 
size has upon overall tape capacity. As the size of the data 
records gets smaller, there is a corresponding decrease in 
the amount of data that can be stored on one tape. The 
advantage of super-blocking is that it makes the tape capac- 
ity independent of record size. The effect of super-blocking 
is to minimize the portion of tape capacity lost to interrecord 
gaps. For example, a normal tape written with 16K-byte 
records will waste 12.5M bytes compared to a super-blocked 
tape. 

What Fig. 2 does not show is the effect on capacity of 
file marks. A file mark is a special pattern written on the 
tape that denotes a break in the host data. A file mark uses 
a very small portion of tape. However, there is an additional 
gap for each file mark. Because of this extra gap, super- 
blocking also absorbs all file marks and keeps track of where 
they were originally located. For simplicity, it is assumed 
that the ratio of the number of file mark gaps to the number 
of data record gaps is typically very small. Therefore, the 
effect on tape capacity of the absorption of file marks will 
not be considered in this article. One should note that the 
advantage of super-blocking for increased tape capacity 
would only improve for each file mark requested by the 
host. 

As explained in the article on page 26, the HP 7980XC 
Tape Drive is capable of performing data compression on 
the data that it receives. Referring to Fig. 2, a tape written 
with 16K-byte records will contain 154M bytes of host data. 
If this data were compressed by the HP 7980XC and exhib- 
ited a compression ratio of 4:1, one would expect the tape 
capacity to increase by a factor of four to 616M bytes. However, 
this is not the case, since only the physical record 
size is reduced in proportion to the compression ratio. 
The interrecord gaps are not reduced. 
Thus the original 16K-byte records are indeed 4K bytes 
long after compression, but the resulting tape capacity is 
only 471M bytes rather than the expected 616M bytes, 
which is 24% less. It is to prevent this effective loss of 
capacity that super-blocking is needed. 

Fig. 1. The super-blocking process. 

Fig. 2. Tape capacity versus data record size for a 2400-foot 
tape written at 6250 bpi with 0.3-inch gaps. 

Using the example of 16K-byte records compressed to 
4K-byte records, the effect of super-blocking can readily 
be seen. The compressed 4K-byte records are super-blocked 
and packed into 60K-byte records instead of being written 
directly to the tape. This results in a tape capacity of 666M 
bytes instead of 471M bytes. This is a capacity improve- 
ment of approximately 41.5%. By combining data compres- 
sion with super-blocking, the limitations that the half-inch 
tape format imposes on data compression are overcome. 
In addition to obtaining the full benefit of data compres- 
sion, super-blocking further improves the tape capacity. 
The table below demonstrates how super-blocking affects 
this example: 



Condition                                  Tape Capacity   Tape
(16K-Byte Input Records)                   (MBytes)        Compaction

No Data Compression or Super-Blocking      154             1.00:1
Super-Blocking Only                        166             1.08:1
4:1 Data Compression Only                  471             3.06:1
4:1 Data Compression and Super-Blocking    666             4.32:1
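The capacities in this table can be approximated with a simple tape-length model. The sketch below is not taken from the drive firmware; it assumes that every physical record costs its own length at 6250 bytes per inch plus one 0.3-inch gap, that "M bytes" means 2**20 bytes, and that super-block header overhead is negligible. The function and constant names are illustrative only.

```python
# Rough capacity model for a 2400-foot, 6250-bpi tape with 0.3-inch gaps.
# Assumptions (not from the article): 1 Mbyte = 2**20 bytes, no header overhead.
TAPE_INCHES = 2400 * 12
DENSITY_BPI = 6250
GAP_INCHES = 0.3
MBYTE = 1 << 20


def host_capacity_mbytes(physical_record_bytes, compression_ratio=1.0):
    """Host data that fits on the tape, in Mbytes.

    physical_record_bytes is the record size actually written to tape;
    compression_ratio scales it back up to the host data it represents.
    """
    inches_per_record = physical_record_bytes / DENSITY_BPI + GAP_INCHES
    records_on_tape = TAPE_INCHES / inches_per_record
    host_bytes = records_on_tape * physical_record_bytes * compression_ratio
    return host_bytes / MBYTE


no_features = host_capacity_mbytes(16 * 1024)            # 16K-byte records as sent
super_block_only = host_capacity_mbytes(60 * 1024)       # packed into 60K-byte records
compression_only = host_capacity_mbytes(4 * 1024, 4.0)   # 16K compressed to 4K
both = host_capacity_mbytes(60 * 1024, 4.0)              # compressed, then packed
```

Under these assumptions the four results land within about 1 Mbyte of the tabulated 154, 166, 471, and 666 Mbytes, which shows that the capacity lost to compression alone is entirely a matter of interrecord gaps.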



Fig. 3 illustrates the combination of data compression 
and super-blocking implemented in the HP 7980XC. 

Complications 

Implementing this concept of super-blocking in the HP 
7980XC was made more complex by the constraints im- 
posed by the host interface, the drive mechanism, and the 
industry standards for the half-inch tape format. The phys- 
ical format of a super-blocked, data-compressed tape writ- 
ten by the HP 7980XC does not violate the ANSI 6250 GCR 
specification, but the logical meaning of the data is changed. 
This means that another 6250 GCR tape drive can read a 
compressed tape, but only an HP 7980XC will be able to 
decipher the data that was sent by the original host. This
does not preclude the HP 7980XC's being used for data inter- 
change with other GCR drives, since the drive can easily be 
configured to write the data it receives in an uncompressed 
format, just as any other 6250 GCR drive would do. 

Since the physical specifications of the 6250 GCR format 
are maintained on a compressed tape, a method for differen- 
tiating a compressed tape from a normal GCR tape was 
needed. The method chosen to accomplish this is to write 
special noncompressed records at the beginning of a compressed
tape. Whenever a tape is loaded into an HP 7980XC,
the drive automatically searches for these records. If they
are not found, the tape is treated as a normal uncompressed
tape. If they are found, the tape is recognized as compressed
and the drive separates the super-blocks and decompresses
the records before sending them to the host.

Another complication stems from the embedded gaps 
and file marks within a super-block. To execute the typical
host command to space to a record or file, all super-blocks
must be read and processed to determine the location of 
the next record or file. This is not a problem when the tape 
is moving forward, since no performance penalty is incurred 
by reading the data instead of spacing over it. However, 
since the HP 7980 family of drives cannot read data when 
the tape is moving in reverse, reverse record/file spacing 
becomes much more complicated. Super-blocks on the tape 
must first be backed over and then read in the forward 
direction. Hypothetically, if a backspace file command 
were issued near the end of the tape and the beginning of 
the preceding file was very near the beginning of the tape, 
all of the super-blocks on the tape would have to be backed 
over and then read, a situation that might be intolerable. 

The backspace file problem is solved by recording in 
each super-block the running count of how many super- 
blocks have been written since the last file mark was writ- 
ten. This provides the information needed to determine 
how many records can be safely backed over without miss- 
ing the file mark. Thus, single backspace file commands 
can be executed efficiently. The backspace record com- 
mand does not negatively impact performance because the 
previous record is typically within the current super-block 
or the preceding one. 

Another issue that had to be addressed was overwriting. 
This occurs when a host writes and fills the entire tape, 
rewinds the tape, and then writes a directory at the begin- 
ning of the tape, expecting the rest of the previous writes 
to remain intact. This practice is strongly discouraged for 
sequential access devices, but does occur. If it is done, it 
invalidates the backspace file information in some of the 
super-blocks. This is because extra records and/or file 
marks are put back onto the tape after the previous back- 
space file information was written. 

To support this activity, a physical tape mark is written 
to the tape whenever the host switches from writes to any 
other tape motion command. If a tape mark is encountered 
during backspacing, it indicates that some data has been 
previously overwritten. The backspace operation must read 
the super-block in front of the tape mark because the pre- 
vious information used in the backspace file command may 
have been corrupted by an overwrite condition. By reading
this super-block, the tape drive gets accurate information
regarding the start of the file.

Results 

The true figure of merit for the HP 7980XC in compres- 
sion mode is the observed tape compaction ratio. This ratio 
combines the benefits of the HP data compression al- 
gorithm with the advantages of super-blocking. The tape 
compaction ratio is equal to the compression ratio of the 
host data times the super-blocking advantage factor (SAF). 
The SAF is dependent upon the average host data record 
size. A graph of SAF versus average record size is shown 
in Fig. 4. The compression ratio is a function of the amount 
of redundancy exhibited by the host's data. 
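As a small worked example of this relationship, the sketch below multiplies a compression ratio by an SAF and also inverts the relation to recover the SAF implied by a measured compaction ratio. The function names are illustrative and not part of any HP software; the numbers are the 16K-byte example given earlier in this article.

```python
def tape_compaction(compression_ratio, saf):
    """Tape compaction ratio = data compression ratio x SAF."""
    return compression_ratio * saf


def implied_saf(compaction_ratio, compression_ratio):
    """SAF implied by an observed compaction and compression ratio."""
    return compaction_ratio / compression_ratio


# 16K-byte host records: 4:1 compression and 4.32:1 overall compaction.
saf_16k = implied_saf(4.32, 4.0)
overall = tape_compaction(4.0, saf_16k)
```

The implied SAF of 1.08 agrees with the 1.08:1 compaction listed for super-blocking alone in the earlier table.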

The following table shows the data compression bench- 
mark results previously outlined on page 31 with the over- 
all tape compaction results obtained with an HP 7980XC 
Tape Drive. 

Summary of HP 7980XC Tape Compaction Results 



Data Description                Volume     Compression     Tape
                                (Mbytes)   Ratio (alone)   Compaction

MPE/MPE XL on HP 3000s
  Series 68 (HP Desk)            528        3.93            4.35
  Series 68 (Data Base)         2924        4.31            4.83
  Series 68 (Misc. Data)        1559        4.30            5.04
  Series 70 (Manufacturing)     2924        4.31            4.83
  Series 930 (Code)              nil        3.44            3.97

HP-UX on HP 9000s
  Series 800 (Commercial HP-UX)  226        2.06            2.73
  Series 500 (Code)              363        2.38            2.57
  Series 500 (Data Base)         336        4.07            4.39
  Series 500 (VLSI)              785        2.52            3.34
  Series 300 (Archive)           329        2.30            3.05

DEC
  DEC VAX (Code)                 423        2.31            2.65

HP Series 200 Running Pascal O.S.
  Series 200 (Misc. Data)        467        2.47            2.67

Amdahl
  Amdahl (Corporate Data)       5000        3.79            3.86

Fig. 4. Super-blocking advantage factor (SAF) versus data
record size for a tape written at 6250 bpi with 0.3-inch gaps.

High-Speed Lightwave Component Analysis

A new analyzer system performs stimulus-response testing 
of electrical-to-optical, optical-to-electrical, optical-to- 
optical, and electrical-to-electrical components of high- 
speed fiber optic communications systems. 

by Roger W. Wong, Paul Hernday, Michael G. Hart, and 
Geraldine A. Conrad 



HIGH-SPEED FIBER OPTIC COMMUNICATIONS 
systems have emerged over the last half decade to 
compete with other forms of communications sys- 
tems as a cost-effective means of moving information. A 
decade ago, the possibility of a commercially installed 500-Mbit/s
fiber optic system seemed remote. Today, not only
are many fiber optic systems operating at hundreds of mega- 
bits per second, but pilot systems are being installed that 
operate at 1.7 to 2.4 gigabits per second. The trend toward 
higher system bit rates places more demand upon the light- 
wave component designer to optimize the performance of 
each device within the high-speed lightwave system. 

Lightwave System Challenges 

Fig. 1 shows the typical functional blocks in a fiber optic
communication system. The high-speed portions of the 
lightwave system are the preamplifier, the directly mod- 
ulated laser, the optical fiber, the photodiode receiver, and 
the postamplifier. As systems transmit higher bit rates, each 
of the components needs to be designed to meet the higher 
speed requirements. However, with the higher speeds, op- 
timization of signal transmission through the various de- 
vices becomes more challenging and the interactions of 
various components become more evident and difficult to 
minimize. 

Fig. 2 shows some of the typical challenges the high- 
speed component designer encounters as systems move to 
gigabit-per-second transmission rates. As in lower-bit-rate 
systems, optical power budgets are affected by the insertion 
loss of the optical fiber, connectors, and splices. In the 
higher-bit-rate systems (>500 Mbits/s), interactions be- 
tween the high-speed devices are significant. Often more 
extensive analysis and device characterization are required 
to optimize the electrical and optical interfaces between 
these high-speed components in a systematic way. 

For example, electrical mismatches between the laser 
and its preamplifier or between the photodiode and its 
postamplifier can affect the modulation transfer function 
and the cumulative modulation bandwidth. Also, light re- 
flected back into the laser source affects its modulation 
transfer characteristics and the system signal-to-noise ratio. 

Lightwave Component Analyzer Systems 

Fig. 3 shows the key instruments that form the HP 8702A
Lightwave Component Analyzer measurement systems.
Three basic systems are offered:

■ Modulation capability to 6 GHz at 1300 nm 

■ Modulation capability to 3 GHz at 1300 nm (high dynam- 
ic range) 

■ Modulation capability to 3 GHz at 1550 nm (high dynam- 
ic range). 

Each HP 8702A system consists of a lightwave source, a 
lightwave receiver, the lightwave component analyzer, and 
a lightwave coupler. Fig. 4 shows the HP 83400 family of 
lightwave sources and receivers, which are important ele- 
ments of the measurement systems. More information on 
these sources and receivers can be found in the article on 
page 52. 

The system measures the modulation transfer function 
of a device under test and provides the modulation amplitude
and phase response of that device. The input or stimulus
signal can either be a radio frequency (RF) signal or a
modulated optical signal, and the output or response signal
can either be an RF signal or a modulated optical signal.
Thus, the device under test (DUT) can be an electrical-to-electrical,
electrical-to-optical, optical-to-electrical, or
optical-to-optical device, depending upon the measurement
block diagram employed and the calibration procedure
used. Table I shows typical examples of each device type.
By adding an optical signal separation device, such as a
lightwave directional coupler, the system can be configured
to measure optical reflections in a wide variety of optical
devices, such as optical fiber components, connectors,
antireflection coatings, bulk optic devices, and so on. Moreover,
if there are multiple reflections in a device, each reflection
can be located very accurately. Multiple reflections can be
resolved when they are only centimeters apart.

Fig. 1. Functional blocks in a typical fiber optic communications
system.

Fig. 2. Effects of component char-

In this article, lightwave component analysis means the 
capability to characterize a given device in terms of its 
modulation transfer function, electrical driving-point im- 
pedance, optical return loss, and length, as appropriate to 
the device type. 

Lightwave Component Analyzer Operation 

The hardware design of the HP 8702A Lightwave Com- 
ponent Analyzer is virtually identical to that of the HP 
8753B RF Network Analyzer. However, the HP 8702A has 
operating features that make it more appropriate for light- 
wave measurements. 

The HP 8702A consists of three main subsystems that 
tie together the lightwave measurement system: an RF 
source, RF receivers, and processing/display (see Fig. 5). 
The lightwave measurement system is analogous to a lightwave
communication system. The HP 8702A performs the
functions of an information source and an information receiver.
The data processing subsystem uses this information
to measure the modulation transfer characteristics of
lightwave components.

Table I
Types of Lightwave Devices

                      Electrical (RF) Input       Optical (Modulated) Input

Electrical (RF)       Electrical-to-Electrical    Optical-to-Electrical
Output                Devices                     Devices
                      ■ Amplifiers                ■ PIN Photodiodes
                      ■ Coaxial Cables and        ■ Avalanche Photodiodes
                        Passive Components        ■ Optical Receivers
                      ■ Repeater Links

Optical (Modulated)   Electrical-to-Optical       Optical-to-Optical
Output                Devices                     Devices
                      ■ Laser Diodes and LEDs     ■ Optical Fibers, Passive
                      ■ Optical Sources             Components, Attenuators
                      ■ Optical Modulators        ■ Optical Modulators
                                                  ■ Regenerators

Signals used to modulate a lightwave source are produced
by a synthesized RF source in the HP 8702A. The
RF source provides linear, logarithmic, and list frequency
sweeps from 300 kHz to 3 GHz with 1-Hz resolution. Power
and CW sweeps may also be generated. The source is phase
locked to the R receiver channel, which is described below.
The HP 8702A provides the power supply for lightwave
sources and receivers.

Fig. 3. HP 8702A Lightwave Component Analyzer systems consist
of a lightwave source, a lightwave receiver, the analyzer, and
a lightwave coupler. Three basic systems have modulation
capability to 6 GHz at 1300 nm or to 3 GHz at 1300 or 1550 nm.

Demodulated signals from a lightwave receiver are measured
by three 300-kHz-to-3-GHz tuned RF receivers in
the HP 8702A. The receivers' bandwidths are extended to
6 GHz with Option 006. Measurements of electrical devices
have a dynamic range of over 100 dB. A portion of the R
receiver signal is used to phase lock the source to the reference
channel. Input signals are sampled and down-converted
to a 4-kHz IF. The 4-kHz IF signals for the A, B, and
R inputs are converted into digital words by the analog-to-digital
converter (ADC).



Fig. 4. The HP 83400 family of 
lightwave sources and receivers 

The data processing flow from the ADC to the display
is shown in Fig. 6. The digital filter performs a discrete
Fourier transform (DFT) on the digital words. The samples
are converted into complex number pairs. The DFT filter
shape can be altered by changing the IF bandwidth. Decreasing
the IF bandwidth is an effective technique for
noise reduction. A reduction of the IF bandwidth by a
factor of ten lowers the measurement noise floor by approximately
10 dB.
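The 10-dB figure follows if the noise is assumed to be white, so that noise power scales linearly with IF bandwidth. A minimal sketch of that arithmetic (the function name is illustrative, and the white-noise assumption is ours rather than a statement from the instrument documentation):

```python
import math


def noise_floor_change_db(old_if_bw_hz, new_if_bw_hz):
    """Change in noise floor, in dB, when the IF bandwidth is changed.

    Assumes white noise, so noise power is proportional to bandwidth.
    A negative result means the noise floor drops.
    """
    return 10 * math.log10(new_if_bw_hz / old_if_bw_hz)


change = noise_floor_change_db(3000, 300)  # tenfold reduction in IF bandwidth
```

With these inputs the change is -10 dB, matching the rule of thumb in the text.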

Ratio calculations are performed next, if the selected 
measurement is the ratio of two inputs, that is, A/R, B/R,
or A/B. The ratio is formed by a simple division operation. 

The sampler/IF correction operation is applied next. This
process digitally corrects for frequency response error,
primarily sampler roll-off, in the analog down-conversion
path.

Fig. 5. HP 8702A Lightwave Component Analyzer block diagram.
The hardware is essentially the same as the HP 8753B RF
Network Analyzer.

Sweep-to-sweep averaging is another noise reduction 
technique. This involves taking the complex exponential 
average of several consecutive sweeps weighted by a user- 
specified averaging factor. Each new sweep is averaged 
with the previous result until the number of sweeps equals 
the averaging factor. Doubling the averaging factor reduces 
the noise by 3 dB. This technique can only be used with 
ratio measurements. 
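One way to picture this averaging is as a running average over complex trace points whose weight stops shrinking once the sweep count reaches the averaging factor. The sketch below is an illustration of that idea in plain Python, not the analyzer's firmware; the function name and the list-of-complex representation of a sweep are assumptions.

```python
def sweep_average(sweeps, averaging_factor):
    """Exponentially weighted average of consecutive complex sweeps.

    Sweep n is blended in with weight 1/n until n reaches the averaging
    factor; after that the weight stays at 1/averaging_factor.
    """
    average = None
    for n, sweep in enumerate(sweeps, start=1):
        weight = 1.0 / min(n, averaging_factor)
        if average is None:
            average = list(sweep)
        else:
            average = [(1 - weight) * a + weight * s
                       for a, s in zip(average, sweep)]
    return average


two_sweeps = sweep_average([[1 + 1j], [3 + 3j]], averaging_factor=2)
three_sweeps = sweep_average([[0j], [0j], [4 + 0j]], averaging_factor=2)
```

Because the average is taken on complex values, uncorrelated noise partially cancels while the coherent signal does not; for N averaged sweeps the noise power falls roughly as 1/N, which is where the 3 dB per doubling comes from.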

The raw data arrays store the results of all of the preced- 
ing data processing operations. All processing up to this 
point is performed in real time by the fast digital signal 
processor shown in Fig. 5. The remaining operations are 
performed asynchronously by the main processor. These 
arrays can be stored to an external disc drive and can be 
accessed directly via the HP-IB (IEEE 488, IEC 625). 

Vector error correction is performed next, if a measurement
calibration has been performed and correction is
turned on. Error correction removes repeatable systematic
errors (stored in the error coefficient arrays) from the raw
arrays. This can vary from simple vector normalization to
full two-port (12-term) error correction. Correction for the
various types of lightwave measurements is described in
more detail below.

The results of error correction are stored in the data arrays
as complex number pairs. The data arrays can be stored to
disc and accessed via the HP-IB. If the data-to-memory
operation is performed, the data arrays are copied into
the memory arrays. The memory array is also externally
accessible.

The trace math operation selects either the data array, 
the memory array, or both to continue flowing through the 
data processing path. In addition, the complex ratio of the 
two (data/memory) or the difference (data - memory) can 
also be selected. If memory is displayed, the data from the 
memory arrays goes through the same data processing flow
path as the data from the data arrays.

Gating is a digital filtering operation associated with 
time-domain transform (Option 010). Its purpose is to re- 
move unwanted responses isolated in time. In the time 
domain, this can be viewed as a time-selective bandpass 
or band-stop filter.

The delay block involves adding or subtracting phase in 
proportion to frequency. This is equivalent to extending 
or shortening the electrical length in the measurement path 
or artificially moving the reference plane. 
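A phase term proportional to frequency is simply multiplication of each complex trace point by a unit-magnitude exponential. The following sketch shows the idea; the function name and the sign convention (a positive value adds phase, as if delay were removed from the measurement path) are assumptions for illustration.

```python
import cmath
import math


def apply_electrical_delay(freqs_hz, trace, delay_s):
    """Add phase proportional to frequency to a complex trace.

    Assumed convention: positive delay_s adds 2*pi*f*delay_s radians,
    which moves the reference plane as if delay_s had been removed.
    """
    return [x * cmath.exp(2j * math.pi * f * delay_s)
            for f, x in zip(freqs_hz, trace)]


# A quarter-period of delay at 1 GHz (0.25 ns) corresponds to 90 degrees.
shifted = apply_electrical_delay([1e9], [1 + 0j], 0.25e-9)
phase_deg = math.degrees(cmath.phase(shifted[0]))
```

The magnitude of each point is unchanged; only the phase slope across the sweep is altered.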

Conversion, if selected, transforms the measured s-pa- 
rameter data to the equivalent complex impedance or ad- 
mittance values, or to inverse s-parameters. 
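For a reflection measurement, the textbook relations between a reflection coefficient and the equivalent impedance or admittance are shown below. This is a sketch of the standard formulas rather than the analyzer's own code, and the 50-ohm reference impedance is an assumed default.

```python
def reflection_to_impedance(s11, z0=50.0):
    """Equivalent complex impedance for reflection coefficient s11."""
    return z0 * (1 + s11) / (1 - s11)


def reflection_to_admittance(s11, z0=50.0):
    """Equivalent complex admittance (the reciprocal of the impedance)."""
    return (1 - s11) / (z0 * (1 + s11))


matched = reflection_to_impedance(0j)          # no reflection: Z equals z0
mismatched = reflection_to_impedance(1 / 3 + 0j)  # reflection of 1/3: Z is 2 * z0
```

A perfectly matched load maps to 50 ohms and a reflection coefficient of 1/3 maps to 100 ohms, which is a quick sanity check on the direction of the conversion.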

The transform operation converts frequency-domain in- 
formation into the time domain when time-domain trans- 
form is enabled (Option 010 only). The results resemble 
time-domain reflectometry (TDR) or impulse-response 
measurements. The transform employs the chirp-Z inverse 
Fourier transform algorithm. Windowing is a digital filter- 
ing operation that prepares the frequency domain data for 
transform to the time domain. The windowing operation 
is performed on the frequency-domain data just before the 
transform. 

Formatting converts the complex number pairs into a 
scalar representation for display, according to the selected 
format. Formats include log magnitude in dB, phase, and
group delay. Polar and Smith chart formats retain complex 
data for display on real and imaginary axes. 

Smoothing is another noise reduction technique. When 
smoothing is on, each data point in a sweep is replaced by 
the moving average value of several adjacent points. The 
number of points included depends on the smoothing aper- 
ture, which is selected by the user. The result is similar to 
video filtering. 
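A moving average over a formatted trace can be sketched as follows. The aperture is expressed here as an odd number of points, and the window is simply truncated at the ends of the sweep; both choices are assumptions for illustration, since the text does not say how the instrument treats the trace edges.

```python
def smooth(trace, aperture_points):
    """Replace each point by the mean of the points inside the aperture.

    The window is centered on the point and truncated at the trace edges
    (an assumed edge treatment).
    """
    half = aperture_points // 2
    smoothed = []
    for i in range(len(trace)):
        lo = max(0, i - half)
        hi = min(len(trace), i + half + 1)
        window = trace[lo:hi]
        smoothed.append(sum(window) / len(window))
    return smoothed


result = smooth([0.0, 3.0, 0.0, 3.0, 0.0], aperture_points=3)
```

Unlike sweep-to-sweep averaging, this operates within a single sweep, so a wide aperture also blurs genuine fine structure in the response.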

The results at this point in the data processing chain are
stored in the format arrays. Marker values, marker functions,
and limit testing are all derived from the format
arrays. The format arrays can be stored to an external disc
drive and can be accessed via the HP-IB.

Fig. 6. HP 8702A Lightwave Component Analyzer data processing
flow diagram.

The offset and scale operations prepare the formatted 
data for display on the CRT. This is where the reference 
line position, reference line value, and scale calculations 
are performed as appropriate to the format and graticule 
type. 

The display memory stores the display image for presen- 
tation on the CRT. The information here includes grati- 
cules, annotation, and softkey labels in a form similar to 
plotter commands. When hard-copy records are made, the 
information sent to the plotter or printer is taken from 
display memory. 

The HP 8702A can be connected with an s-parameter
test set (HP 85046A) to make electrical reflection measurements,
such as s-parameters s11 and s22 (return loss and
impedance). An HP 85047A S-Parameter Test Set can be
used to measure modulation transfer function to 6 GHz.

Firmware Features 

The principal contributions of the HP 8702A are its 
firmware enhancements. The firmware was developed 
using the HP 8753A RF Network Analyzer as a platform. 
The HP 8702A firmware contains the features of the HP 
8753A as well as new features specific to lightwave measure- 
ments. The most significant enhancement is the ability to 
perform measurement calibration of lightwave components. 

The measurement calibration process consists of measur- 
ing a characterized standard and using it to measure an 
unknown device. The firmware contains a mathematical 
model of the calibration standard and the model's param- 
eters. Data from measurement of the standard is used with 
the calibration models to remove systematic errors from 
measurements of the test device. Lightwave measurements 
are also scaled to proper units for the particular component 
type. 

Calibration of optical devices is performed using through 
connections and known reflections as standards. Calibra- 
tion is done for transmission measurements by connecting 
the lightwave source to the lightwave receiver with the test 
device removed. Reflection measurements require a known 
reflection as a calibration standard. For example, the Fres- 
nel reflection occurring at the test port connector of a light- 
wave coupler is a repeatable and convenient reflection stan- 
dard (3.5%). 
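The 3.5% figure is consistent with the normal-incidence Fresnel equation for a glass-to-air interface. The sketch below assumes a fiber index of 1.46 (the index of refraction that appears in the transform parameters display) and air on the other side of the connector face; the function name is illustrative.

```python
import math


def fresnel_reflectance(n_fiber, n_outside=1.0):
    """Power reflectance at normal incidence between two media."""
    return ((n_fiber - n_outside) / (n_fiber + n_outside)) ** 2


reflectance = fresnel_reflectance(1.46)            # fraction of power reflected
return_loss_db = -10 * math.log10(reflectance)     # same quantity as return loss
```

Under these assumptions the reflectance comes out at about 0.035, equivalent to an optical return loss between 14 and 15 dB.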

Calibrated lightwave receivers and calibrated lightwave 
sources are used as standards for electrooptical and op- 
toelectrical test device measurements. The calibration pro- 
cess is the same for both types of devices. Calibration infor- 
mation is provided in two forms. The first form is a digitized 
modulation frequency response of the standard. This infor- 
mation is read by the analyzer from a disc provided with 
each calibrated source and receiver. The second is a curve 
fit of the frequency response. Coefficients are entered by 
the user into the analyzer using values printed on each 
lightwave source and receiver. 

Calibration of electrical devices is the same as in most 
HP network analyzers. Calibration kits containing standard 
devices are available for several different connector types. 
Typical standards include shorts, opens, and loads. 



Time-domain transform, an optional feature of the HP
8702A, is an extremely powerful tool in lightwave measurements.
Data measured in the frequency domain is converted
to the time domain using a chirp Fourier transformation 
technique. The resulting time scale is extremely accurate 
and stable because of the synthesized frequency sweep. 
Measurements of distance can be derived from transmis- 
sion or reflection measurements using the index of refrac- 
tion or velocity factor of the test device. The HP 8702A 
has an enhancement to assist in setting transform parame- 
ters. The distance range and resolution of the transformed 
data depend on the width of the frequency-domain sweep 
and the number of data points. The transform parameters 
feature assists the user by displaying range and resolution 
values as sweep parameters are set (Fig. 7). 
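The dependence of range on sweep width and number of points can be illustrated with two short relations. These formulas are inferred from the values shown in the Fig. 7 display rather than quoted from the firmware, so treat them as a sketch: the alias-free time range is taken as (points - 1) / span, and time is converted to one-way distance with the index of refraction.

```python
C_M_PER_S = 299_792_458.0  # speed of light in vacuum


def transform_range_m(span_hz, points, index_of_refraction):
    """Alias-free distance range for a given sweep (inferred relation)."""
    time_range_s = (points - 1) / span_hz
    return C_M_PER_S * time_range_s / index_of_refraction


def time_to_distance_m(time_s, index_of_refraction):
    """Convert a time interval to distance in the test device."""
    return C_M_PER_S * time_s / index_of_refraction


# Values from the Fig. 7 display: 2.9997-GHz span, 201 points, index 1.46.
range_m = transform_range_m(2.9997e9, 201, 1.46)
response_resolution_m = time_to_distance_m(651.55e-12, 1.46)  # pulse width
```

With the displayed settings these give a range of about 13.69 m and a response resolution of about 133.8 mm, in line with the figure. Widening the sweep shortens the pulse and improves resolution, while adding points extends the range.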

Measurement Concept 

The lightwave component analyzer measurement con- 
cept is shown in Fig. 8. The information source provides 
a sine wave whose amplitude and phase characteristics are 
known. This signal serves as the modulation signal to the 
lightwave source (transmitter). The output signal of the 
transmitter is an intensity modulated optical carrier at a 
fixed wavelength. The intensity modulation (i.e.. ampli- 
tude modulation) envelope of the lightwave signal is pro- 
portional to the radio frequency sine wave information 
signal. Because the laser lightwave source is dc-biased in 
the linear region of its optical-power-versus-input-current 
characteristic, the average optical power from the lightwave 
source is the same whether or not a modulation signal is 
present. 

The intensity modulated signal from the lightwave 
source is transmitted through the optical medium, most 
commonly optical fiber, although it could be an open-beam 
environment. The lightwave receiver demodulates the in- 
tensity modulated lightwave signal and recovers the 
sinusoidal RF envelope, which is proportional to the sine 
wave from the information source. The demodulated signal 
is compared in magnitude and phase to the original signal 
by the HP 8702A analyzer. 

The optical signal incident upon an optical device under
test is of the form:

$$f(t) = a(t)\cos(\omega t)$$

where $a(t)$ is the RF modulation signal and $\cos(\omega t)$
represents the lightwave carrier signal at a given wavelength.

The device under test operates on the amplitude of both
the modulation envelope and the carrier signal identically
and delays both signals by identical amounts, yielding the
following relationship for the DUT output:

$$g(t) = |H|\,a(t + \Delta t)\cos(\omega(t + \Delta t))$$

where $|H|$ is the magnitude of the transfer function of the
DUT, $\Delta t = \phi/\omega$, and $\phi$ is the phase of $H$.

TRANSFORM PARAMETER       Channel 1

RANGE                     13.6905 m
RESPONSE RESOLUTION       133.79 mm
TRANSFORM SPAN            40 ns
RANGE RESOLUTION          41.067 mm

TRANSFORM MODE            BANDPASS
START FREQUENCY           300 kHz
STOP FREQUENCY            3 GHz
FREQUENCY SPAN            2.9997 GHz
NUMBER OF POINTS          201
INDEX OF REFRACTION       1.46
PULSE WIDTH               651.55 ps

SOURCE POWER              0 dBm
SWEEP TIME                800 ms
IF BANDWIDTH              3000 Hz

Fig. 7. ... up the instrument for the optional time-domain
transform.

Fig. 8. Measurement concept. The system compares transmitted
and received sine wave modulation superimposed on the
1300-nm or 1550-nm optical carrier.

The impact of the DUT on the carrier can be determined 
by measuring the modulation envelope. Basically, the mea- 
surement process consists of two steps: (1) calibration of 
the system, and (2) measurement of the DUT. This measure- 
ment process is essentially a substitution method. The sys- 
tem is calibrated by measuring a known quantity and then 
the DUT is substituted for the known device and measured. 



Electrooptical Calibration Theory

Two important electrooptical devices are lasers and
photodiodes. Measurements of their modulation transfer
characteristics and modulation bandwidths are of primary
interest for the design of high-speed lightwave communications
systems. The development of electrooptical calibration
routines for measuring electrooptical devices such as
lasers and photodiodes was a significant challenge.

Fig. 9 shows the relationship between optical power and
RF current for typical electrooptical devices, such as lasers,
optical modulators, and photodiodes. The slopes of the
curves at points $(I_1, P_o)$ and $(P_o, I_2)$ define the slope
responsivities of the electrical-to-optical and optical-to-electrical
devices, $r_s$ and $r_r$, respectively, as shown.

For electrical-to-optical devices:

$$\Delta P_o = r_s \, \Delta I_1 \qquad (1)$$

where $\Delta P_o$ is the peak-to-peak optical power swing, $r_s$ is
the slope responsivity of the electrical-to-optical device in
W/A, and $\Delta I_1$ is the peak-to-peak RF current swing.

For optical-to-electrical devices:

$$\Delta I_2 = r_r \, \Delta P_o \qquad (2)$$

where $\Delta I_2$ is the output peak-to-peak RF current swing, $r_r$
is the slope responsivity of the optical-to-electrical device
in A/W, and $\Delta P_o$ is the peak-to-peak optical power swing.

The relationship between the device slope responsivities
and RF current gain can be derived from equations 1 and 2:

$$\Delta I_2 / \Delta I_1 = r_s r_r \qquad (3)$$

Equation 3 forms the basis for the electrooptical calibrations
and allows the measurement of an electrical-to-optical
device separately from an optical-to-electrical device,
which is one of the contributions of the measurement system.

For each HP lightwave source or receiver, a calibration
data disc is provided, which contains the device's slope
responsivity, modulation amplitude frequency response,
and modulation phase frequency response. This data disc
can be downloaded into the analyzer as part of the electrooptical
calibration procedure.¹ The calibration data is
traceable to an internal HP optical heterodyne system
called the Optical Heterodyne Calibration System.

Fig. 9. Relationship between optical power and RF current for
typical electrooptical devices.
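Because the measured RF current gain is the product of the two slope responsivities, knowing one of them from calibration data lets the other be isolated by division. The sketch below illustrates that substitution step; the function name and the numeric values are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def unknown_responsivity(current_gain, known_responsivity):
    """Solve (current gain) = r_s * r_r for the uncalibrated device.

    current_gain is the measured ratio of output to input RF current;
    known_responsivity comes from the calibrated standard's data.
    """
    return current_gain / known_responsivity


# Hypothetical example: a calibrated receiver with r_r = 0.5 A/W and a
# measured end-to-end current gain of 0.17 imply a source r_s of 0.34 W/A.
r_s = unknown_responsivity(current_gain=0.17, known_responsivity=0.5)
```

The same division works in the other direction: with a calibrated source, the quotient is the responsivity of the receiver under test in A/W.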

Laser Bandwidth and Power Compression 
Measurements 

Fig. 10 shows the measurement block diagram for lasers 
and other electrical-to-optical devices. In this configura- 
tion, laser diode and/or laser transmitter characteristics 
such as responsivity, modulation bandwidth, modulation
phase or deviation from linear phase, and modulation 
power compression can be measured. 

A commercially available 1-Gbit/s lightwave transmitter 
is used as an example. The laser was dc-biased in the linear 
range of its optical-power-versus-input-current curve 
(about 15 mA above its threshold) and modulated with an 
incident RF power of approximately +10 dBm. The measured
laser responsivity (0.34 W/A or -9.35 dB) and modulation
bandwidth (about 600 MHz) are shown in the top
trace in Fig. 11.
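The decibel value quoted for responsivity is close to what a 20 log10 conversion relative to 1 W/A would give. The sketch below assumes that convention; it is an inference from the quoted numbers, not a definition taken from the analyzer documentation, and the small difference from the quoted figure is presumably display rounding.

```python
import math


def responsivity_db(responsivity, reference=1.0):
    """Responsivity in dB relative to a reference of 1 W/A (or 1 A/W).

    Assumes the 20*log10 convention used for current or voltage ratios.
    """
    return 20 * math.log10(responsivity / reference)


laser_db = responsivity_db(0.34)
```

For 0.34 W/A this evaluates to roughly -9.4 dB, within a few hundredths of a dB of the -9.35 dB reading.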

The built-in inverse Fourier transform feature of the HP 
8702A allows modulation frequency response data to be
converted to the equivalent step or impulse response. For 
the above example, where the laser was operating in the 
linear region, an equivalent step response with rise time 
information can be calculated and displayed, as shown in 
the bottom trace in Fig. 11. Notice that the transmitter's 
frequency response is peaked by about 2.5 dB at 428 MHz. 
This accounts for the underdamped time-domain step re- 
sponse. 

To illustrate the modulation power compression measurement,
a commercially available electrical-to-optical
converter with an internal laser preamplifier was selected.
The same block diagram as shown in Fig. 10 was used.
The analyzer has the ability to change the RF signal power to
the DUT over a 25-dB range at a fixed modulation frequency.
New calibration routines were developed that
allow the input RF power and modulated optical power
measurement planes to be referenced at the connectors of
the device under test. Fig. 12 shows two measurements.
The top trace shows transmitter modulated optical power
out as a function of RF power into the transmitter. The
bottom trace shows transmitter responsivity as a function
of RF power incident to the transmitter with the modulation
frequency fixed at 400 MHz. The top curve shows that the
transmitter has a compressed modulated optical power of
-1.44 dBm (or 0.72 mW peak to peak) with an incident
RF power of 1.8 dBm. The bottom curve shows the transmitter
responsivity linearity and compression characteristics.
The small-signal responsivity is 0.38 W/A (or -8.3 dB)
and compresses by 1 dB at -4.4 dBm incident RF power.

Fig. 10. Measurement block diagram for lasers and other
electrical-to-optical devices.

Laser Reflection Sensitivity Measurements 

Most high-speed lasers are sensitive to back-reflected 
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light. The reflected light can couple back into the laser's 
cavity, be reamplified, and change the laser's modulation
transfer characteristics. The fine-grain ripple that often re- 
sults is called reflection noise. Fig. 13 shows the measure- 
ment setup to characterize the change in the laser's modula- 
tion transfer function and bandwidth when different levels 
of light are intentionally reflected back into the laser. 

The directional coupler test port (point A) is where the 
reflected light condition is developed. The modulation fre- 
quency response is referenced to a condition in which no 
incident light at point A is reflected back toward the laser 
under test, that is, an optical load is created at point A.
This reference condition is normalized to 0 dB. When the
reflection condition is changed, the resulting measurement 
shows the deviation or change of the laser's modulation 
response for that reflection condition (a laser reflection 
sensitivity measurement). An example of such a measure- 
ment of a commercially available laser transmitter is shown 
in Fig. 14. The worst response represents the condition 
when approximately 95% of the light was reflected back 
to the laser under test. The improved response was 
achieved when a polarization controller was inserted be- 
tween the test port and the 95% optical reflector and the 
polarization of the reflected light was adjusted to minimize 
the response roll-off. 

Photodiode Measurements 

Measurements characterizing optical-to-electrical de- 
vices, such as photodiodes and lightwave receivers, are 
similar to laser measurements. The measurement block
diagram is shown in Fig. 15. 

Two-Port Optical Device Measurements 

The loss, gain, and modulation bandwidth of any two- 
port optical device can be measured using the measurement 
block diagram shown in Fig. 16. Examples of such devices 
are optical connectors, attenuators, other passive optical 



devices, modulators, and optical regenerators. In this mea- 
surement, the input stimulus and output response signals 
are intensity modulated light signals. The device under 
test can be a single component or an optical subsystem 
such as an interferometer or sensor. 

If an optical attenuator is selected as the device under 
test, not only can the attenuator loss be measured, but also 
the effective measurement system dynamic range can be 
determined for optical transmission measurements. Fig. 17 
shows such a measurement. This particular system dis- 
plays more than 50 dB of optical dynamic range. 

Optical Reflection Measurements 

The measurement of optical reflections and the identifi- 
cation of their locations are becoming more important in 
gigabit-rate lightwave systems, subsystems, optical sensors,

(continued on page 44)




Fig. 13. Measurement block diagram for laser reflection
sensitivity measurements.
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OTDR versus OFDR 



The HP 8702A Lightwave Component Analyzer system can
display a test device's transmission and reflection characteristics
in the modulation frequency domain or the time (distance)
domain. Because it shows reflections in the time domain, a
comparison of its capabilities to those of an optical time-domain
reflectometer is often requested.

An optical time-domain reflectometer (OTDR) like the HP
8145A¹ measures reflections and losses in optical fibers and
other devices by sending a probe pulse of optical energy into
the device and measuring the reflected and backscattered
energy. The HP 8702A Lightwave Component Analyzer
described in the accompanying article makes optical reflection
measurements differently, that is, by transforming reflected
frequency-domain data mathematically into the time domain. Hence
it can be thought of as an optical frequency-domain reflectometer
(OFDR). Both measurement systems measure optical reflections
and lengths and have some overlapping and complementing
capabilities, but in general, they are designed for different
application areas and therefore have significant differences. The
OTDR is primarily a fiber installation or maintenance tool used
for installing fiber, checking for faults, and measuring splice loss.
The HP 8702A OFDR technique is a lab bench tool used for
component and device characterization or location of reflections.

The table below summarizes the principal differences between 
the OTDR (HP 8145A Optical Time-Domain Reflectometer) and 
the OFDR (as implemented in the HP 8702A Lightwave
Component Analyzer).



                                    OFDR Mode             OTDR
                                    (HP 8702A)            (HP 8145A)

Reflection measurement              Yes                   Yes
  (one-port measurement)
Measures loss versus distance       No                    Yes
  (dB/km) (backscatter)
Measures splice loss and            No                    Yes
  breaks in fiber
Measures magnitudes and positions   Yes                   Yes
  of optical reflections
Measures optical return loss        Read directly         Derivable
  of reflections
Distance range                      About 40 km (4%       Greater than
                                    Fresnel reflection)   100 km
Dead zone                           None                  Tens of meters (1)
Single-event resolution             <2 mm (2)             Meters
  (in optical fiber)
Two-event resolution                3.4 cm (3)            Tens of meters
  (in optical fiber)                1.7 cm (4)
Gates out unwanted responses        Yes                   No

(1) Dead zone depends upon pulse width used in the measurement.

(2) Assumes that the index of refraction is known accurately and
does not limit the measurement accuracy.

(3) Theoretical limit. Assumes a 3-GHz frequency span. 6 cm
observed empirically in a nonoptimized experiment.

(4) Theoretical limit. Assumes a 6-GHz frequency span. 2.5 cm
observed empirically in a nonoptimized experiment.



Since the OTDR measures backscatter, it can locate and
measure discontinuities that do not produce a reflected signal. A
sudden drop in backscatter level clearly shows a fiber break.
The HP 8702A OFDR technique is not suited to these
applications, which are generally required for fiber installation.

Conversely, in designing and manufacturing small components,
connectors, or fibers, the excellent resolution and stability
of the HP 8702A OFDR technique make it the best method for
determining the exact locations of closely spaced reflections. In
addition, the gating function can be used to eliminate parasitic
responses and isolate the effect of a particular reflection.

Both measurement systems perform a one-port reflection
measurement on the optical device under test by injecting an optical
stimulus signal and detecting the reflected optical signal. In the
case of the OTDR, the injected stimulus signal is an optical pulse
or pulse train and the reflected signal consists of the reflected
(Fresnel) and backscattered (Rayleigh) power.¹

In the case of the HP 8702A OFDR mode, the injected stimulus
signal is an amplitude modulated optical signal swept over a
range of modulation frequencies, and the response is an
interference pattern which is the summation of individual reflected
(Fresnel) signals caused by differences in the index of refraction
at each interface. The optical return loss versus distance
information is generated by performing an inverse Fourier transform on
the modulation frequency response data. OFDR as implemented
by the HP 8702A does not detect the backscattered (Rayleigh)
light, and therefore cannot measure loss versus distance in an
optical device, such as an optical fiber. (See also "Optical
Reflection Measurements," page 42.) However, the HP 8702A OFDR
mode can measure optical reflections at each interface where
the index of refraction changes and can locate each of these
individual reflections very accurately. The system also allows the
direct measurement and display of a reflection's magnitude in
terms of optical return loss (dB) or reflection factor.

Since the HP 8702A system derives the time/distance
information from the frequency-domain data and the system is calibrated
to a known reflection (which calibrates the reflection level and
location of the known reflection), there is no dead zone. In other
words, reflections can be located from the calibration reference
plane to many kilometers, depending upon the instrument
calibration states.

The single-event resolution refers to the accuracy with which
the location of any given reflection can be determined. In the case
of the HP 8702A system, any given reflection can be located
within 2 mm, assuming that the index of refraction of the medium
is known to an accuracy that does not limit the measurement
system accuracy.

The two-event resolution is the minimum separation at which
two adjacent reflections can be detected with at least 3 dB
between their respective peaks and the valley between the peaks.
The two-event resolution theoretical limit of the HP 8702A system
is 3.4 cm and 1.7 cm for frequency spans of 3 and 6 GHz,
respectively. Experiments have been conducted to verify the
two-event resolution of the HP 8702A system on optical fiber
samples cut to lengths of 2.5 and 6 cm. Each fiber end face was
cleaved so that it was perpendicular to the fiber's longitudinal
axis, yielding an end-face reflection (Fresnel) to air of
approximately 3.5% of the incident power.
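
The 3.5% end-face figure can be reproduced with the normal-incidence Fresnel formula. The sketch below is illustrative only: the fiber index of 1.46 is an assumed value (the article does not state it), and the function names are hypothetical.

```python
import math

def fresnel_reflectance(n1, n2):
    """Power reflectance at normal incidence between media of index n1 and n2."""
    return ((n1 - n2) / (n1 + n2)) ** 2

def return_loss_db(reflectance):
    """Optical return loss in dB for a given power reflectance."""
    return -10 * math.log10(reflectance)

N_FIBER = 1.46  # assumed index of the fiber core, not given in the article
N_AIR = 1.0

r = fresnel_reflectance(N_FIBER, N_AIR)
print(round(100 * r, 1))             # 3.5 (percent of incident power)
print(round(return_loss_db(r), 2))   # 14.56 (dB)
```

With these assumptions a cleaved fiber end in air corresponds to an optical return loss of roughly 14.6 dB.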

Reference 

1. M. Fleischer-Reumann and F. Sischka, "A High-Speed Optical Time-Domain
Reflectometer with Improved Dynamic Range," Hewlett-Packard Journal, Vol. 39, no. 6,
December 1988, pp. 6-13.

Fig. 14. Laser transmitter modulation response characteristics.
The lower trace is for 95% of the transmitted light reflected
back to the laser. The upper, flatter response was obtained 
with a polarization controller between the transmitter and the 
95% reflector. 

(continued from page 42)

and optical components. The HP 8702A system is well- 
suited to perform optical reflection and length measure- 
ments on a wide variety of components and subsystems. 

Fig. 18 shows the block diagram for measuring optical 
reflections and optical return loss of any optical device 
under test. If a device has more than a single reflection, 
for example reflections of varying magnitudes spaced out 
at different distances in the device, the HP 8702A test sys- 
tem can measure the total reflection or each constituent 
reflection and its respective location. This system can be 
used to measure reflections of fiber optic components or 
bulk optic components when the proper collimating optics
are added to the lightwave coupler test port in the test 
system. 

If there are two or more reflections in the device under
test, then at a fixed modulation frequency the individual
reflections will have different phase relationships with
respect to the measurement reference plane and will sum to
a given modulation amplitude and phase. As the modulation




Fig. 15. Measurement block diagram for optical-to-electrical
devices such as photodiodes.







Fig. 16. Measurement block diagram for two-port optical
devices.

frequency is changed, the phase relationship of each indi- 
vidual reflection will change, depending on its delay time, 
resulting in a different overall modulation amplitude and 
phase ripple pattern. The ripple pattern contains the reflec- 
tion magnitude and location information. By performing 
an inverse Fourier transform on the ripple pattern, a signa- 
ture of the individual reflections can be displayed as a 
function of time (and hence, distance). 
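
The idea can be illustrated with a small numerical sketch. It is a simplified model, not the analyzer's algorithm: it assumes two ideal reflections, a sweep that starts at 0 Hz with uniform steps, no windowing, a group index of 1.46, and delays that fall exactly on time bins. All names and values are hypothetical.

```python
import cmath
import math

C = 299_792_458.0   # speed of light in vacuum, m/s
N_GROUP = 1.46      # assumed group index of the fiber
N_POINTS = 64       # number of modulation frequency points (assumed)
F_STEP = 100e6      # modulation frequency step in Hz (assumed)

def ripple_pattern(reflections):
    """Sum of reflected signals versus modulation frequency.
    reflections: list of (amplitude, round_trip_delay_seconds)."""
    return [
        sum(a * cmath.exp(-2j * math.pi * k * F_STEP * t) for a, t in reflections)
        for k in range(N_POINTS)
    ]

def inverse_dft(spectrum):
    """Plain inverse discrete Fourier transform (O(N^2), for clarity)."""
    n = len(spectrum)
    return [
        sum(x * cmath.exp(2j * math.pi * k * m / n) for k, x in enumerate(spectrum)) / n
        for m in range(n)
    ]

time_bin = 1.0 / (N_POINTS * F_STEP)  # time resolution of the transform
reflections = [(0.2, 5 * time_bin), (0.1, 12 * time_bin)]

response = [abs(h) for h in inverse_dft(ripple_pattern(reflections))]
peaks = [m for m, h in enumerate(response) if h > 0.05]

for m in peaks:
    delay = m * time_bin
    distance = C * delay / (2 * N_GROUP)  # one-way distance from round-trip delay
    print(f"bin {m}: magnitude {response[m]:.2f}, distance {100 * distance:.1f} cm")
```

Each peak in the transformed data marks one reflection; its height gives the reflection magnitude and its bin gives the round-trip delay, which converts to distance once the index of the medium is known.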

The examples presented here show the reflection mea- 
surement capabilities of HP 8702A Lightwave Component 
Analyzer systems on small components. 

The first example shows that with a lensed optical fiber 
connector added to the block diagram of Fig. 18, reflections 
of optical devices can be measured in an open-beam envi- 
ronment. The devices under test are a glass slide and a flat 
surface gold-plated to form a 95% reflective surface at 1300
nm. In Fig. 19, the top trace shows the ripple pattern generated
from the reflections and rereflections from the glass
slide and the gold wafer. The bottom trace shows the indi- 
vidual reflections and rereflections and their respective 
locations in time (distance). 

The second example shows the reflections in a length of 
fiber that has three internal mirrors fabricated to produce 
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Fig. 17. This attenuator loss measurement shows not only 
the insertion loss of the device, but also a dynamic range of 
at least 50 dB. 
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Fig. 18. Measurement block diagram for measuring optical 
return loss and reflections.

approximately 2% reflections at three different locations 
in the fiber. This component is typical of devices found in 
various fiber sensor applications. Fig. 20 shows the device 
dimensions and the measurement of the individual optical 
reflections and their respective locations. The absolute lo- 
cation of any of the individual reflections can be measured 
to within a few millimeters, given the correct test condi- 
tions. 
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Fig. 19. (top) A combination of a glass slide and a 95% 
reflector in an open-beam environment. (middle) Ripple
pattern generated by reflections and rereflections in the setup
at top. (bottom) Locations of individual reflections.







Fig. 20. (top) A glass fiber with three internal mirrors
producing 2% reflections. (bottom) A measurement of the three
reflections and their locations. (Fiber internal mirrors courtesy
of Texas A&M University EE Department.)

Fig. 21 shows the optical return loss of the optical launch 
from a laser chip into the fiber identified by marker 3 (11.46 
dB return loss), and the optical return loss of the laser 
module's optical fiber connector at marker 2 (about 37 dB 
return loss). Optical return loss of other optical devices 
and launches such as photodiodes, lenses, and antireflec- 
tion coatings can also be measured easily. 

The widest modulation frequency span limits the 
minimum separation at which two adjacent reflections can 
be resolved, that is, the best two-point resolution. For a
6-GHz modulation frequency span, the system's theoretical
two-point resolution is about 1.71 cm in fiber. Fig. 22 shows 
a measurement of two reflections, about 4% each, spaced 
approximately 2 cm apart. The modulation frequency span 
was 6 GHz. Individual reflections can be located within 
less than 2 mm. 
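
The quoted theoretical resolutions are consistent with a simple span-limited relation, resolution = c / (2 n B), where B is the modulation frequency span. This relation and the index value n = 1.46 are assumptions used here for illustration; the article does not state the formula it used.

```python
C = 299_792_458.0  # speed of light in vacuum, m/s

def two_point_resolution_cm(span_hz, index=1.46):
    """Span-limited two-point resolution in fiber, in centimeters.
    The formula c / (2 * n * span) and the default index are assumptions."""
    return 100 * C / (2 * index * span_hz)

print(round(two_point_resolution_cm(6e9), 2))  # 1.71 (cm, 6-GHz span)
print(round(two_point_resolution_cm(3e9), 2))  # 3.42 (cm, 3-GHz span)
```

Doubling the frequency span halves the minimum resolvable separation, which is why the 6-GHz system resolves reflections about 2 cm apart.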

Optical Heterodyne Calibration System 

The transfer function of each lightwave source and re- 
ceiver is measured at the factory and stored on a disc, 
which is shipped with the product. This calibration data 
is loaded into the HP 8702A Lightwave Component 
Analyzer during the measurement calibration process.
System accuracy is adjusted at the factory using the sim- 
ple but powerful heterodyne or beat frequency technique 
shown in Fig. 23. Light from two highly stable single-line 
lasers is combined to form a single optical test beam. The 
receiver under test effectively filters out the optical fre- 
quency terms and develops a beat frequency response only. 
A frequency sweep is obtained by changing the temperature 
of one of the lasers. 2 
Fig. 21. Optical return loss of the optical launch from a laser
chip (marker 3) and the optical fiber connector (marker 2).

A spectrum analyzer monitors the swept RF output of 
the lightwave receiver under test. Narrowband filtering, 
feasible because of the less than 10-kHz linewidth of the 
lasers, provides an exceptionally repeatable measurement. 

Since the amplitude of the beat signal is a function of 
the polarization of the two laser beams, the system is im- 
plemented in polarization-maintaining fiber. Laser output 
polarization does not change over the modest temperature 
tuning range. A second error source, variation of the laser 
output powers with time and temperature, is eliminated 
by sampling their outputs throughout the measurement 
process and compensating for it. The receiver under test 
is calibrated as an optical average power meter and its bias 
current is monitored to measure variations in the average 
optical powers of the two lasers. 

The resulting system is capable of generating beat fre- 
quencies from dc to over 40 GHz with over 50-dB dynamic 
range. A special reference receiver is calibrated with this 
system and used to calibrate sources and receivers. 
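
The beat frequency produced by two optical carriers is the difference of their optical frequencies. The sketch below shows how a small wavelength offset maps to an RF beat frequency; the 1300-nm wavelengths and the 0.1-nm offset are illustrative assumptions, not values taken from the calibration system.

```python
C = 299_792_458.0  # speed of light in vacuum, m/s

def beat_frequency_hz(wavelength_1_m, wavelength_2_m):
    """Beat (difference) frequency of two optical carriers given their
    vacuum wavelengths in meters."""
    return abs(C / wavelength_1_m - C / wavelength_2_m)

# Assumed example: two lasers near 1300 nm, detuned by 0.1 nm.
f_beat = beat_frequency_hz(1300.0e-9, 1300.1e-9)
print(f"{f_beat / 1e9:.1f} GHz")  # 17.7 GHz
```

Under these assumptions, a detuning of only a fraction of a nanometer is enough to sweep the beat note across many gigahertz.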

Exceptional laser performance is obtained from Nd:YAG 
ring lasers, CW-pumped by shorter-wavelength diode la- 
sers. Frequency tuning is accomplished by changing the 
temperature of the ring, which is fully contained in a spe- 
cially faceted crystal. 

Measurement Accuracy Considerations 

The HP 8702A system performance depends not only on 
the performance of the individual instruments, but also on 
the measurement system configuration and on user- 
selected operating conditions. The HP 8702A system pro- 
vides a set of measurement calibrations for both transmis- 
sion and reflection measurements. 

The type of calibration depends on the type of device 
and the measurement. For example, if the measurement is 



of optical insertion loss, a frequency response calibration 
would be performed. This calibration removes from the 
measurement the frequency response of the system. This 
is done by connecting a cable between the lightwave source 
and receiver. Once this measurement calibration is stored, 
the DUT can be connected in place of the cable and a 
corrected measurement (i.e., the DUT's optical insertion 
loss) will be displayed. 

In any measurement, sources of uncertainty influence 
the system's measurement accuracy. The major ones are 
optical and RF connector repeatability, reflection sensitiv- 
ity (or noise) of the laser source, directivity of a coupler in 
reflection measurements, and accuracy and repeatability 
of standards and models used in the measurement calibra- 
tions. 

Connector repeatability is a measure of random vari- 
ations encountered in connecting a pair of optical or RF 
connectors. The uncertainty is affected by torque limits, 
axial alignment, cleaning procedures, and connector wear. 
Optical connector repeatability problems can be minimized 
by using precision connectors such as the Diamond * HMS- 
10/HP connector, 3 or by using splices instead of connectors. 

Reflection sensitivity (or noise) refers to the change in 
the behavior of a laser (i.e., its transfer characteristics) when 
reflected light reenters the laser cavity. The effect of the 
reflected light depends on many factors, including the mag- 
nitude, delay, and polarization of the reflected light. Reflec- 
tion sensitivity can be minimized by buffering the laser 
with an optical attenuator or an optical isolator. 

Fig. 22. A measurement of two 4% reflections 2 cm apart.

Fig. 23. Heterodyne system used to calibrate lightwave
receivers in production. Calibration data is stored on a disc
that is shipped with the product.

The term directivity refers to how well a directional
coupler (optical or electrical) directs a signal, or how well
it separates the incident from the reflected signal.
Directivity is calculated as the difference between the reverse
isolation of the coupler and the forward coupling factor (e.g.,
if reverse isolation is -50 dB and coupling factor is -3
dB, then the directivity is -50 - (-3) = -47 dB). However,
while the coupler itself may have > -50 dB directivity,
the connectors and internal splices may cause reflections
that may reduce the effective directivity of the packaged
coupler.
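
The directivity arithmetic from the example above, written as a minimal sketch (the function name is illustrative):

```python
def directivity_db(reverse_isolation_db, coupling_factor_db):
    """Directivity as the difference between reverse isolation and
    forward coupling factor, both expressed in dB."""
    return reverse_isolation_db - coupling_factor_db

print(directivity_db(-50.0, -3.0))  # -47.0
```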

Every measurement calibration uses standards that have 
default models within the HP 8702A instrument firmware
or have data that is provided externally. Each of these stan- 
dards and models has accuracy and repeatability charac- 
teristics that affect the overall system uncertainty. For 
example, when calibrating for an electrooptical measure- 
ment, the user can enter the transfer characteristic data of 
the lightwave source or receiver into the HP 8702A in two 
ways: by using the factory-supplied 3.5-inch disc or by 







Fig. 24. Measurement block diagram for a photodiode
receiver transmission measurement.



entering the calibration factors printed on a label for the 
source or receiver. The lightwave source or receiver data 
has some accuracy relative to the factory system on which 
each instrument is measured. In addition, use of the 3.5- 
inch disc data offers better model repeatability than the 
calibration factors printed on the label, since the calibration 
factors represent a polynomial fit to the data stored on the 
disc. 
Fig. 25. Typical ±3σ uncertainty of the receiver responsivity
measurement as a function of modulation frequency. The solid
lines show the maximum and minimum values for the setup
of Fig. 24. The dashed line is the value for the same setup
with a low-reflection 10-dB optical attenuator between the
source and the receiver.
Fig. 26. HP 8702A guided setup screen for selecting the
type of measurement.

Example: Receiver Responsivity Measurement 

A photodiode receiver with 0.32 A/W or -10 dB
responsivity, -14 dB of optical input mismatch, and -14 dB of
electrical mismatch was measured by the system shown in 
Fig. 24. The responsivity of the receiver can be read from 
the CRT in dB for any given modulation frequency. 

The uncertainties considered while computing the accu- 
racy of the measurement are as follows: optical match in- 
teraction of the lightwave source and receiver, optical 
match interaction of the lightwave source and DUT, electrical
match interaction of the lightwave receiver and the HP
8702A analyzer input, the same uncertainty for the DUT
and the HP 8702A analyzer input, reflection sensitivity of 
the lightwave laser, dynamic accuracy, lightwave receiver 
accuracy, lightwave receiver model uncertainty, and wave- 
length related uncertainty. 

Fig. 25 shows the uncertainty (dB) of the receiver respon- 
sivity measurement (described above) over an RF modula- 
tion frequency range of 300 kHz to 3 GHz. The solid lines 
represent the maximum and minimum values for the con- 
figuration shown in Fig. 24. The dashed line represents 
Fig. 28. Screen for configuring the measurement hardware.

the value for the same configuration with a low-reflection 
10-dB optical attenuator between the lightwave source and 
the DUT to reduce the reflection sensitivity of the laser. 

User Interface 

A significant feature of the HP 8702A Lightwave Compo- 
nent Analyzer is the guided setup user interface. It consists 
of a series of softkey menus, instructions, and graphical 
displays to assist the user in configuring measurement 
hardware and in setting basic instrument parameters. 
Guided setup is one part of the user interface. The user is 
assisted in making fundamental measurements without fac- 
ing the entire set of advanced instrument features. 

The HP 8702A uses RF and microwave network analysis 
techniques for making various lightwave measurements. 
At the beginning of the project it was felt that many of the 
potential HP 8702A users would be unfamiliar with tradi- 
tional HP network analyzers. A major goal of the project 
was to develop a user interface that would be easy to use, 
particularly for those with no network analyzer experience. 

Guided setup provides a subset of the HP 8702A feature 
Fig. 27. Screen for selecting the type of device.



Fig. 29. Noise floor trace for a 3-GHz system for optical
transmission measurements.
Fig. 30. Noise floor trace for a 3-GHz system for optical
reflection measurements in the frequency domain.

set. The user is given only the choices needed to set up a 
basic measurement. The commands are accompanied by 
textual and graphical instructions in a logical sequence. 

When the analyzer is first turned on, the user is given
instructions on choosing either guided setup or normal
unguided instrument operation. At any time after selecting
normal operation, the user can start guided setup through 
one of the regular softkey menus. Conversely, the user can 
exit guided setup and go to normal instrument operation 
at any time. 

Guided setup consists of a series of screens that assist 
the user in configuring a measurement and setting basic 
instrument parameters. Each screen consists of a softkey 
menu, instructions, and a graphical display. The screens 
are ordered to teach the general measurement sequence 
recommended in the User's Guide. Each screen contains 
an operation to be performed or parameters to be set. The
user progresses through guided setup by pressing the CON- 
TINUE softkey. If existing values and/or instrument states 



Fig. 32. Typical 6-GHz system frequency-domain noise
performance for optical transmission measurements.

are satisfactory, the user can proceed without making 
changes by pressing CONTINUE. To return to a previous 
screen, the user presses the PRIOR MENU softkey. 

Guided setup has the general sequence: select type of
measurement (Fig. 26), select type of device (Fig. 27),
configure measurement hardware (Fig. 28), set instrument
parameters, calibrate measurement, set measurement format
and scale, print or plot measurement, and save instrument
state in an internal register.

Guided setup is structured so that action is focused on 
the display and softkey menus. The user is not required to 
use the labeled keys on the front panel with the exception 
of the entry keys. Instrument parameter values are entered 
using the numbered keys, the up/down arrow keys, or the 
knob. Values are entered in normal operation with the same 
method. 

System Performance 

Typical measurement system performance is dependent 
Fig. 31. Effective 3-GHz system time-domain noise
performance for optical reflection measurements.



Fig. 33. Typical 6-GHz system frequency-domain noise
performance for optical reflection measurements.
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upon the lightwave source and receiver used with the HP 
8702A Lightwave Component Analyzer. In addition, the 
system dynamic range and noise floor performance are de- 
pendent on the calibration routine selected (e.g., response 
or response/isolation calibration) and the signal processing 
features used (e.g., IF bandwidth, signal averaging, signal 
smoothing). 

The system dynamic range is defined as the difference 
between the largest signal measured, usually given by a 
reference level of 0 dB, and a signal 3 dB above the system 
noise floor, as measured in the frequency domain. Besides 
the HP 8702A Lightwave Component Analyzer, the 3-GHz 
system includes an HP 83400A (1300 nm, 3 GHz, single-mode
9/125-μm fiber), HP 83401A (1300 nm, 3 GHz, multimode
50/125-μm fiber), or HP 83403A (1550 nm, 3 GHz,
single-mode 9/125-μm fiber) Lightwave Source, an HP
83410B Lightwave Receiver, and an HP 11889A RF Interface
Kit. The 6-GHz system includes an HP 83402A (1300
nm, 6 GHz, single-mode 9/125-μm fiber) Lightwave Source,
an HP 83411A Lightwave Receiver, an HP 85047A 6-GHz
S-Parameter Test Set, and an HP 8702A Lightwave Component
Analyzer Option 006 (6-GHz capability). For reflection
measurements, the addition of a lightwave directional
coupler is required in the measurement block diagram as
shown in Fig. 18. Depending upon the optical fiber size,
either an HP 11890A (single-mode 9/125-μm fiber) or an
HP 11891A (multimode 50/125-μm fiber) Lightwave
Coupler should be used.

To determine the system dynamic range, the system noise 
floor must be determined for the measurement. For the 
3-GHz system, typical noise floor performance is shown in 
Figs. 29, 30, and 31. Fig. 29 shows an averaged noise floor 
trace (ave = 16) for optical transmission measurements; it 
varies from -55 dB at low frequencies to -50 dB at 3
GHz, which yields a 47-dB dynamic range. Fig. 30 shows
an averaged noise floor trace (ave = 16) for optical reflection
measurement in the frequency domain; it varies from
-47 dB to -43 dB. This noise floor yields a 40-dB dynamic
range in the frequency domain. Fig. 31 shows the effective 
system noise floor for an optical reflection measurement 
viewed in the time domain. It is derived by performing an 
inverse Fourier transform on the optical reflection noise 
floor data in the frequency domain shown in Fig. 30. The 
effect of the inverse Fourier transform on the frequency- 
domain data is to increase the measurement dynamic range 
in the time domain. Fig. 31 shows a 12-dB improvement 
in dynamic range or a noise floor of -55 dB in the time/ 
distance domain. 

For the 6-GHz system, typical frequency-domain noise
performance for optical transmission and reflection
measurements is shown in Figs. 32 and 33, respectively.
Typical time-domain or distance-domain noise performance
for optical reflection measurements derived from
frequency-domain data (Fig. 33) is shown in Fig. 34. In Fig.
32, the noise trace was averaged sixteen times and shows
a -38-dB worst-case point, which corresponds to a
dynamic range of 35 dB. Fig. 33 shows an averaged (ave
= 16) noise floor performance of -30 dB for optical reflection
measurements obtained in the frequency domain; this
corresponds to a usable dynamic range of 27 dB, typically.
For optical reflection measurements in the time or distance 



domain, the averaged noise floor is reduced to -41 dB, 
which corresponds to a dynamic range of 38 dB, typically. 
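
The dynamic range figures in this section follow directly from the definition given earlier: the 0-dB reference level minus a signal 3 dB above the worst-case noise floor. A minimal sketch, with the noise floor values taken from the text and the function and dictionary names chosen only for illustration:

```python
def dynamic_range_db(noise_floor_db, reference_db=0.0, margin_db=3.0):
    """Dynamic range: reference level minus a signal `margin_db` above
    the noise floor."""
    return reference_db - (noise_floor_db + margin_db)

# Worst-case noise floors quoted in the text, in dB.
noise_floors = {
    "3 GHz transmission (frequency domain)": -50.0,
    "3 GHz reflection (frequency domain)": -43.0,
    "3 GHz reflection (time domain)": -55.0,
    "6 GHz transmission (frequency domain)": -38.0,
    "6 GHz reflection (frequency domain)": -30.0,
    "6 GHz reflection (time domain)": -41.0,
}

for name, floor in noise_floors.items():
    print(f"{name}: {dynamic_range_db(floor):.0f} dB")
```

This reproduces the 47, 40, 52, 35, 27, and 38 dB values quoted in the text.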

Table II summarizes the typical system dynamic range 
for each combination of lightwave source and receiver in 
the HP 83400 family when used with the HP 8702A Light- 
wave Component Analyzer. 



Table II

Typical System Dynamic Range

                                          3-GHz      6-GHz
                                          System     System

Electrical [1] or Electrooptical [2]
  (Frequency Domain)                      100 dB     80 dB

Optical [3]
  Transmission (Frequency Domain)         47 dB      37 dB
  Reflection (Frequency Domain)           40 dB      27 dB
  Reflection (Time Domain)                52 dB      38 dB

1. Electrical-to-electrical device:

   $\mathrm{dB} = 10\log(P_2/P_1) = 20\log(V_2/V_1)$

   where: $P_1$ = RF power available at port 1,
          $P_2$ = RF power available at port 2,
          $V_1$ = RF voltage at port 1,
          $V_2$ = RF voltage at port 2
          (50 Ω impedance system).

2. Electrical-to-optical device:

   $\mathrm{dB} = 20\log\left(r_s/(1\ \mathrm{W/A})\right)$

   where: $r_s$ = slope responsivity of the electrical-to-optical
          device.

   Optical-to-electrical device:

   $\mathrm{dB} = 20\log\left(r_r/(1\ \mathrm{A/W})\right)$

   where: $r_r$ = slope responsivity of the optical-to-electrical
          device.

3. Optical device:

   $\mathrm{dB} = 10\log(P_1/P_2)$

   where: $P_1$ = optical power at port 1,
          $P_2$ = optical power at port 2.
Fig. 34. Typical 6-GHz time-domain noise performance for
optical reflection measurements.
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Design and Operation of High-Frequency 
Lightwave Sources and Receivers 

These compact, rugged modules are essential components 
of HP 8702A Lightwave Component Analyzer Systems. 

by Robert D. Albin, Kent W. Leyde, Rollin F. Rawson, and Kenneth W. Shaughnessy 



FOR HIGH-FREQUENCY FIBER OPTIC MEASURE- 
MENTS, calibrated transitions are needed from elec- 
trical signals to optical signals and back again. In 
HP 8702A Lightwave Component Analyzer systems, these
transitions are provided by the HP 83400 family of light- 
wave sources and receivers, which are designed for easy 
integration into HP 8702A measurement systems. Power 
supply connections. RF connections, signal levels, and 
calibration data are all designed for direct compatibility 
with the HP 8702A, which is the signal processing unit in 
the system. 

To date, four lightwave sources and two lightwave receiv- 
ers have been released. They are: 

■ HP 83400A Lightwave Source: 1300 nm, 3-GHz modulation,
single-mode 9/125-µm fiber

■ HP 83401A Lightwave Source: 1300 nm, 3-GHz modulation,
multimode 50/125-µm fiber

■ HP 83402A Lightwave Source: 1300 nm, 6-GHz modulation,
single-mode 9/125-µm fiber

■ HP 83403A Lightwave Source: 1550 nm, 3-GHz modulation,
single-mode 9/125-µm fiber

■ HP 83410B Lightwave Receiver: 1300 or 1550 nm,
3-GHz modulation, multimode 62.5/125-µm fiber

■ HP 83411A Lightwave Receiver: 1300 or 1550 nm,
6-GHz modulation, single-mode 9/125-µm fiber.

Source Design and Operation 

The signal path through each source starts at the rear-panel
RF connector and proceeds through a matching circuit and an
RF attenuator. The attenuator output is transformed into a
modulated light signal by a laser diode. The optical laser
output signal is transmitted through a short piece of
optical fiber to the front-panel connector (see Fig. 1).

Power for the source is supplied from the probe power 
jacks on the front panel of the HP 8702A. Bias current 
requirements of the internal components exceed the 400 
mA available from this 15V supply, so each source includes 
a dc-to-dc converter, which changes the supply to 3V, 1A
for the thermoelectric heat pump.

It was decided to use a laser diode rather than an LED 
as the source element to take advantage of laser diodes' 
high modulation-rate capability, high power, and narrow 
optical spectrum. Lasers are, however, fairly sensitive to 
temperature variations. Output power and wavelength vary
as a function of temperature. The lifetime of the laser is 
affected by the temperature of the environment as well. 
The degradation of operating life as the diode junction tem- 
perature is increased is shown in Fig. 2. 

To help minimize these temperature effects, a thermal 
control loop is used to regulate the temperature of the laser 
to a constant 20°C. 20°C was chosen to give optimum laser 
lifetime and temperature regulation range. The thermoelec- 
tric heat pump has a range of cooling of approximately 
40°C. The thermal loop maintains the temperature of 20°C 
within 0.1°C over the specified environmental temperature 
range of 0°C to 55°C.

Fig. 1. Lightwave source block diagram.

Fig. 2. Laser diode failure rate as a function of junction
temperature.

The laser diode and a temperature sensor are both
mounted on the surface of the thermoelectric heat pump.
A voltage proportional to the temperature of this surface
is generated by the sensor and external circuitry and then
applied to an integrator as an error signal. The integrator
output serves as a control signal for the 70-kHz pulse width
modulated current control circuit. The output of this circuit
goes to an H-bridge (see Fig. 3) which directs current
through the thermoelectric heat pump in the proper sense
to either heat or cool the laser.
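
The behavior of such a loop can be explored with a small simulation. The Python sketch below is not the instrument's circuit: it assumes a first-order thermal model, and the time constant, heat pump gain, and integrator gain are invented for illustration. The sign of the integrator output stands in for the H-bridge direction and its magnitude for the pulse width modulator duty cycle.

```python
def simulate_thermal_loop(t_ambient, t_set=20.0, seconds=600.0, dt=0.01):
    """Integrator-controlled heat pump acting on a first-order thermal model."""
    tau = 10.0    # assumed thermal time constant, s
    gain = 4.0    # assumed degC/s at full drive (about 40 degC of range)
    ki = 0.005    # assumed integrator gain, 1/(degC*s)
    temp = t_ambient
    drive = 0.0   # integrator output, clamped to [-1, 1]
    for _ in range(int(seconds / dt)):
        error = t_set - temp                    # error signal from the sensor
        drive = max(-1.0, min(1.0, drive + ki * error * dt))
        temp += ((t_ambient - temp) / tau + gain * drive) * dt
    direction = "heat" if drive > 0 else "cool"  # H-bridge current sense
    return temp, abs(drive), direction           # abs(drive) ~ PWM duty cycle

for ambient in (0.0, 55.0):
    temp, duty, direction = simulate_thermal_loop(ambient)
    print(f"ambient {ambient:4.1f} degC -> {temp:.2f} degC, {direction}, duty {duty:.2f}")
```

With these assumed values the loop settles at the setpoint from both ends of the environmental range, heating at 0°C ambient and cooling at 55°C ambient. An integrator is a natural choice here because it drives the steady-state error to zero.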

The laser operating point is set by another control loop
(see Fig. 4) consisting of a photodiode, an amplifier, and
the laser bias current source.

The laser diode chip has a front facet and a back facet 
from which light is emitted. The front-facet light is coupled 
into the fiber and goes to the front-panel connector. The 
back-facet light is coupled into a photodiode to generate a 
current proportional to the emitted light. The bias control 
circuit receives this current and generates an error voltage, 
which controls the laser bias current source. The control 
loop's bandwidth is limited to well below the applied RF 
frequencies. 

It is not desirable for the modulating signal to drive the
laser current to its threshold value, since this would cause
clipping of the optical signal. I_threshold is the current at
which the laser diode begins the lasing operation, that is,
when the laser bias current is large enough to produce a
gain that exceeds the losses in the laser cavity.

The dc transfer function of the laser diode is shown in
Fig. 5. At very high diode current, a droop in laser output
may occur. This phenomenon is known as a kink. If the
laser current is allowed to swing into this region, distortion
of the modulation will occur. Therefore, the laser diode
operation point is bounded by I_threshold on the low end and
the kink region on the high end.
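
The two bounds translate into a simple check on the bias current and the peak modulation current. The Python sketch below uses made-up current values and hypothetical function names; it only illustrates the constraint described above.

```python
def swing_within_operating_region(i_bias, i_peak_mod, i_threshold, i_kink):
    """True if bias +/- peak modulation stays between threshold and kink."""
    return (i_bias - i_peak_mod) > i_threshold and (i_bias + i_peak_mod) < i_kink

def max_peak_modulation(i_bias, i_threshold, i_kink):
    """Largest peak modulation current that just reaches one of the bounds."""
    return min(i_bias - i_threshold, i_kink - i_bias)

# Assumed example values in mA (not taken from any data sheet).
I_THRESHOLD, I_KINK, I_BIAS = 20.0, 80.0, 50.0
print(swing_within_operating_region(I_BIAS, 25.0, I_THRESHOLD, I_KINK))  # inside
print(swing_within_operating_region(I_BIAS, 35.0, I_THRESHOLD, I_KINK))  # clips
print(max_peak_modulation(I_BIAS, I_THRESHOLD, I_KINK))
```

Biasing midway between the two bounds gives the largest symmetric swing, which is one reason to regulate the bias point with its own control loop.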

RF modulation is applied to the laser through a dc blocking
capacitor and an RF attenuator. An impedance matching
network is included to match the 50Ω input to the laser
impedance (Fig. 6). Some adjustment of the laser transfer
function is accomplished by varying the RF attenuator to
match the RF swing to the individual laser.

The impedance matching network matches the low-impedance
laser diode to 50Ω. A variable capacitor is included
in the matching network to flatten the modulation
frequency response of the laser. Adjustment of this capacitor
results in a typical frequency response flatness of ±1 dB
to 3 GHz.

Fig. 4. Laser diode bias control loop.

The source microcircuit package is a straightforward 
deep-well design. The laser package is retained in the mi- 
crocircuit package by two wedging clamps which force it 
against x-axis and y-axis datum surfaces while keeping its 
bottom surface pressed against the housing floor. This ap- 
proach was chosen to locate the laser precisely relative to 
the sapphire microcircuit while ensuring adequate heat 
sinking for the laser's internal thermoelectric heat pump. 
The thin-film microstrip circuit provides the RF interface 
between an SMA connector and the RF terminals of the 
laser package. An epoxy-glass printed circuit board inter- 
connects the dc terminals of the laser with the filtered 
feedthroughs of the microcircuit package (Fig. 7). 

Receiver Design and Operation 

The signal path through the receiver starts at the front- 
panel optical connector (see Fig. 8). Once the modulated 
optical signal is inside the receiver module, it travels 
through a short input fiber to the optical launch, where it 
is coupled to a pin photodiode chip. The output of the 
photodiode is an alternating current at the same frequency 
as the modulation. This signal is amplified by a transimped- 
ance amplifier. The output of the amplifier is routed to the 
back panel of the receiver by a short length of coaxial cable.

While simple in concept, the optical launch is difficult
to realize because of the dimensions and parameters of the
components involved. The most obvious approach, launching
the light directly from the fiber end, was tried first.
This approach was abandoned because of poor and inconsistent
performance, fragility, and difficult assembly. The
final design, which offers numerous advantages, is shown
in Fig. 9.

Fig. 3. The thermal control loop keeps the laser diode
within 0.1°C of 20°C.

Fig. 5. The allowable laser operating region is between the
threshold current and the kink region.

A graded-index (GRIN) lens is the primary optical element
in the launch. Whereas normal lenses (e.g., planoconvex)
use material with a constant index of refraction
and curved surfaces to refract light, graded-index lenses
have flat end faces and refract light by virtue of their
quadratically varying internal index of refraction.

The path of a light ray in a graded-index lens is sinusoi- 
dal. The length of the path is used to describe the funda- 
mental parameter of the lens, known as pitch. If a light ray 
traces a sinusoid of 180 degrees, the lens is said to be 
half-pitch. If the ray traces a sinusoid of 90 degrees, the 
lens is said to be quarter-pitch, and so on.

Fig. 7. Mounting of the laser in the source microcircuit.

The lens used in the lightwave receiver optical launch
is just slightly less than half-pitch. Light enters the lens
from the input fiber. It then diverges along sinusoidal paths.
About halfway through the lens the light beam is collimated
and is at its maximum width. Past this point the beam
starts to converge. Just before the beam converges, it exits
the lens. After traveling a short distance through air, the
beam converges and forms an inverted image of the input
fiber on the face of the photodiode.
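
The sinusoidal ray path can be illustrated with the standard paraxial model of a graded-index lens, in which ray height and slope follow cosine and sine terms of the distance traveled. The Python sketch below is a simplified illustration: the gradient constant and launch values are assumed, and refraction at the flat end faces is ignored.

```python
import math

def grin_ray(r0, slope0, g, pitch):
    """Paraxial ray height and slope after `pitch` periods in a GRIN lens.

    r0, slope0: ray height and slope at the input face (inside the lens)
    g:          gradient constant of the lens, 1/mm (assumed value below)
    """
    phase = 2 * math.pi * pitch
    r = r0 * math.cos(phase) + (slope0 / g) * math.sin(phase)
    slope = -r0 * g * math.sin(phase) + slope0 * math.cos(phase)
    return r, slope

G = 0.3  # assumed gradient constant, 1/mm
for pitch in (0.25, 0.48, 0.50):
    r, slope = grin_ray(r0=0.0, slope0=0.1, g=G, pitch=pitch)
    print(f"pitch {pitch:.2f}: height {r:+.4f} mm, slope {slope:+.4f}")
```

For a ray leaving an on-axis point, the slope is zero at quarter pitch (collimated beam, maximum width) and the height returns to zero at half pitch. At slightly less than half pitch the ray is still off axis but heading toward it, so the focus falls a short distance beyond the exit face, as the text describes.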

The GRIN lens is mounted in a machined ceramic cap. 
The ceramic cap was chosen to minimize the effect on the 
electrical performance of the microstrip thin-film circuit. 
The ceramic material used provides a thermal match to the 
sapphire circuit. This is important since the cap is solidly 
attached to the circuit. 

Alignment of the optical launch and photodiode is crit- 
ical if all of the incident light is to impinge on the small 
active area of the photodiode detector. Misalignments on 
the order of a few micrometers can result in substantial 
signal loss. Achieving this alignment solely through me- 
chanical precision would have been difficult and expen- 
sive. Instead, alignment accuracy is achieved by using an 
interactive technique, as shown in Fig. 10. 

The interactive alignment technique works as follows. 
First, the receiver microcircuit is placed in a test fixture 
and power is applied. An optical launch assembly consisting
of a GRIN lens mounted in a ceramic cap is then coarsely
aligned with the photodiode. Modulated optical power is 
applied to the GRIN lens by the test system lightwave 
source. Next, micropositioners are used to adjust the posi- 
tion of the optical launch while the output of the receiver 
microcircuit is monitored by the HP 8702A. When the po- 
sition of the launch assembly has been optimized for power 
output, the assembly is fastened to the sapphire thin-film 
circuit. 
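
The optimization step of this procedure resembles a simple search for maximum measured power. The Python sketch below shows one way such a search could be automated; it is an assumption for illustration, not the production procedure. The names `align` and `simulated_power` are hypothetical, and the Gaussian response stands in for the receiver output that would be read from the HP 8702A.

```python
import math

def align(measure_power, x, y, step=8.0, min_step=0.25):
    """Coordinate search: try moves in x and y, keep those that raise power."""
    best = measure_power(x, y)
    while step >= min_step:
        improved = False
        for dx, dy in ((step, 0), (-step, 0), (0, step), (0, -step)):
            power = measure_power(x + dx, y + dy)
            if power > best:
                x, y, best, improved = x + dx, y + dy, power, True
        if not improved:
            step /= 2          # refine once no coarse move helps
    return x, y, best

def simulated_power(x, y, x0=3.0, y0=-5.0, spot=10.0):
    """Stand-in for a measurement: Gaussian peak at (x0, y0), in micrometers."""
    return math.exp(-((x - x0) ** 2 + (y - y0) ** 2) / spot ** 2)

x, y, power = align(simulated_power, 20.0, 20.0)  # start from coarse alignment
print(x, y, power)
```

Starting with large steps and halving them keeps the number of measurements small while still resolving the peak to a fraction of a micrometer in this simulated case.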

The conversion from optical signal to electrical signal
takes place at the pin photodiode. The photodiode is of a
proprietary design, optimized for this application. Some
of its requirements are that it respond strongly to light in
the wavelengths of interest, have a flat frequency response
that is uniform across its entire active area, have an active
area large enough so that all of the incident optical signal
can be focused onto it, be linear over a wide range of input
power, and possess good antireflection properties to keep
reflected optical signals to a minimum.

Fig. 6. Input circuit of the lightwave source.

Fig. 8. Lightwave receiver block diagram.

Fig. 9. Optical launch in the lightwave receiver.

The pin photodiode gets its name from its structure. The 
top layer is p-type semiconductor, the middle layer is i-type 
or intrinsic semiconductor, and the bottom layer is n-type 
material. Photons enter the photodiode through the top 
layer. The bandgap of the material is such that it appears 
transparent to the photons and they pass right through. An 
electrical signal is generated when photons are absorbed 
in the i layer of a reverse-biased photodiode, creating an
electron-hole pair. A strong electric field then sweeps out 
the carriers, creating a current that is amplified and de- 
tected in the HP 8702A Lightwave Component Analyzer. 

Once the signal has been generated by the pin photodiode,
it must be transferred into the measurement system.
As is typical in high-frequency applications, the system
uses 50-ohm terminations and coaxial cables.

Fig. 10. Alignment of the optical launch.

Output impedance is one of the receiver parameters op- 
timized to facilitate system integration. In a system where 
termination impedances are not well-controlled, standing 
waves may result. Careful control of the receiver output 
impedance and the well-controlled input impedance of the 
HP 8702A minimize these standing waves and the measurement
errors they can cause.

The HP 83410B Lightwave Receiver also includes a transimpedance
amplifier to increase signal strength. Specifications
for the amplifier are derived from HP 8702A system
requirements. The fundamental specification of the
amplifier is gain. A value for gain is arrived at by consid- 
ering the output noise of the amplifier and the sensitivity 
of the HP 8702A's receiver. To realize the best system per- 
formance with the least expense and complexity, the 

High-Speed PIN Infrared Photodetectors for HP Lightwave Receivers




The HP 83400 family of lightwave receivers uses customized
InP/InGaAs/InP pin photodetectors. The pin detector works by
converting incoming optical energy into an electrical current.
Light of wavelengths 1.2 to 1.6 µm passes through the transparent
InP p layer. The photons are absorbed in the InGaAs i region,
creating an electron/hole pair. The device is fabricated on an
n-type conductive InP substrate. The device is operated in reverse
bias and the electric field sweeps out carriers, creating a
current.

A cross section of the device is seen in Fig. 1. The detector
epitaxial layers are grown using organometallic vapor phase
epitaxy (OMVPE). The mesa structure provides a low-capacitance
device for high-frequency applications.

Receiver performance is determined by device dark current,
responsivity, frequency response, capacitance, and optical
reflections. The photodetector needs a low dark current, which is
a measure of the leakage current of the device under reverse
bias and no illumination. High dark currents may translate into
noise in the lightwave receiver. Dark currents for these devices
are <30 nA at -5V. The high-frequency operation is determined
by a combination of the RC time constant of the photodetector
and the transit time for carriers through the i layer (InGaAs region).
Capacitance should be low and transit times short. These two
parameters are interconnected. If the i layer is thin for short transit
times, the capacitance increases. The design must be optimized
with both in mind. Two such designs are used. In the HP 83410B
Lightwave Receiver, operation to 3 GHz is achieved, and in the
HP 83411A Lightwave Receiver, 6 GHz is obtained. The frequency
response must also be flat across the device's active
region.
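
The tradeoff between capacitance and transit time can be made concrete with textbook approximations. The Python sketch below is illustrative only and does not describe the actual HP devices: it assumes a parallel-plate capacitance model, a 50-ohm load, a commonly quoted rule of thumb for transit-limited bandwidth (about 0.45 times carrier velocity divided by layer thickness), and invented values for permittivity, carrier velocity, and active area.

```python
import math

EPS0 = 8.854e-12  # vacuum permittivity, F/m

def rc_bandwidth_hz(thickness_m, area_m2, load_ohm=50.0, eps_r=13.9):
    """RC-limited bandwidth with a parallel-plate model of the i layer."""
    capacitance = EPS0 * eps_r * area_m2 / thickness_m
    return 1.0 / (2 * math.pi * load_ohm * capacitance)

def transit_bandwidth_hz(thickness_m, velocity_m_s=6.5e4):
    """Transit-time-limited bandwidth (rule-of-thumb approximation)."""
    return 0.45 * velocity_m_s / thickness_m

def combined_bandwidth_hz(thickness_m, area_m2):
    """Approximate combination of the two limits."""
    f_rc = rc_bandwidth_hz(thickness_m, area_m2)
    f_tr = transit_bandwidth_hz(thickness_m)
    return 1.0 / math.sqrt(1.0 / f_rc**2 + 1.0 / f_tr**2)

area = math.pi * (30e-6) ** 2                            # assumed 60-um diameter
thicknesses = [0.5e-6 + 0.1e-6 * i for i in range(36)]   # 0.5 to 4.0 um
best = max(thicknesses, key=lambda d: combined_bandwidth_hz(d, area))
print(f"best i-layer thickness in this model: {best * 1e6:.1f} um")
```

In this model a thinner layer raises capacitance and a thicker layer lengthens transit time, so the combined bandwidth peaks at an intermediate thickness. A larger active area shifts that optimum toward a thicker layer and a lower bandwidth.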
Fig. 1. Photodetector diode top view and cross section.

Fig. 2. Photograph of photodetector chip.

Responsivity is a measure of diode sensitivity. It is the ratio of
photocurrent (I_p) output to absorbed optical power input:

r = I_p/P_optical.

It is important to have high responsivity to absorb as many
incoming photons as possible and convert them into photocurrent.
Typical responsivity values are 0.9 A/W at -5V for incoming
light wavelengths of both 1300 nm and 1550 nm.
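
As a worked illustration of the responsivity definition, the Python sketch below computes the photocurrent for a given optical power and the quantum efficiency implied by a responsivity, using the standard relation between the two (efficiency = responsivity x photon energy / electron charge). The function names are hypothetical, and the 1-mW input power is an arbitrary example.

```python
H = 6.62607015e-34    # Planck constant, J*s
C = 2.99792458e8      # speed of light, m/s
Q = 1.602176634e-19   # elementary charge, C

def photocurrent(responsivity_a_per_w, optical_power_w):
    """Photocurrent from responsivity and optical power."""
    return responsivity_a_per_w * optical_power_w

def quantum_efficiency(responsivity_a_per_w, wavelength_m):
    """Electrons collected per photon implied by a responsivity."""
    return responsivity_a_per_w * H * C / (Q * wavelength_m)

print(photocurrent(0.9, 1e-3))            # amperes for 1 mW of light
print(quantum_efficiency(0.9, 1300e-9))
print(quantum_efficiency(0.9, 1550e-9))
```

The same 0.9 A/W corresponds to a lower quantum efficiency at 1550 nm than at 1300 nm because each longer-wavelength photon carries less energy, so a given optical power delivers more photons.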

Low optical reflections are important in a lightwave system to
avoid feedback to the laser light source. To achieve the highest
quantum efficiency, photons need to pass through the top InP
layer and not be reflected at the top diode surface. An antireflection
coating is used to achieve <2% reflection from the diode
surface for both 1300 nm and 1550 nm wavelengths of incoming
light.

The devices have been tested for long-term reliability by examining
the mean time to failure under high stress conditions of
175°C and -5V. The high-temperature operating life tests show
lifetimes greater than 3 × 10^5 hours at 55°C instrument operating
temperature.

Fig. 2 shows a photograph of a photodetector chip containing
three devices. It shows the metal contact ring, active area with 
antireflection coating, and device bond pad. 

Susan Sloan 

Development Engineer 
Microwave Technology Division 



amplifier has just enough gain so that its output noise is
approximately equal to the input sensitivity of the HP
8702A receiver. Any more amplification and the sensitivity
of the HP 8702A receiver would be wasted; any less and
the system sensitivity would drop.

The amplifier is realized using thin-film circuit construction
for optimum wideband frequency response. Silicon
bipolar transistors are used instead of GaAs FETs to
minimize 1/f noise.

Mechanical Considerations 

It was felt that small, rugged modules would offer significant
advantages to the user in ease of positioning relative
to the DUT and in environments such as light tables, where
space is at a premium and operation remote from the
analyzer is required. The die-cast housings offer the right
combination of size and shape. When assembled to the
modules' aluminum center body, or spine (the modules'
single structural component), these survived shock and
vibration levels during strife testing that were ten times
the qualification test levels. All components (microcircuits,
printed circuit boards, fiber optic and electrical I/O,
and cabling) are mounted to the spine, which allows full
assembly and testing of the modules before they are installed
in their respective enclosures (Fig. 11).

The fiber optic connector adapter system developed by
HP's Böblingen Instrument Division is used on both source
and receiver modules. Based on the precision Diamond®
HMS-10/HP fiber optic connector,¹ the design of these
adapters allows easy access to the ferrule for cleaning, and
allows the internal HMS-10/HP connector to be mated to
any of five different connector systems: HMS-10/HP, FC/PC,
ST, biconic, and DIN. A hinged safety shutter is provided
on the source modules to comply with safety regulations
in certain localities.

Assemblability evaluation method* techniques were 
used throughout the development of the source and re- 
ceiver modules. This method was an extremely useful tool 
in exposing hot spots in the mechanical design, areas where 
the number and/or complexity of the steps in an assembly 
operation make it particularly difficult. Perhaps more im- 
portant, it provided a simple structured way of comparing 
designs and making rough estimates of their cost of assem- 
bly. 

Reference 

1. W. Radermacher, "A High-Precision Optical Connector for
Optical Test and Instrumentation," Hewlett-Packard Journal, Vol. 38,
no. 2, February 1987, pp. 28-30.

*Assemblability evaluation method is a system developed by Hitachi, Ltd. and refined by
General Electric Company. It sets forth objective criteria for evaluating mechanical designs
in terms of the number of parts and the relative difficulty of the operations necessary to
assemble them.

Videoscope: A Nonintrusive Test Tool for
Personal Computers 

The Videoscope system uses signature analysis techniques 
developed for digital troubleshooting to provide a tool that 
allows a tester to create an automated test suite for doing 
performance, compatibility, and regression testing of 
applications running on HP Vectra Personal Computers. 

by Myron R. Tuttle and Danny Low



INTERACTIVE TESTING OF APPLICATION SOFTWARE 
requires the tester to sit at the test system and enter test 
data using a keyboard and/or some other input device 
such as a mouse, and observe the results on the screen to 
determine if the software being tested produces the correct 
results for each set of test data. This process is time-con- 
suming and error-prone if it is done manually each time the 
tester wants to repeat the same set of tests. This process 
must be automated to ensure adequate test coverage and 
improve the productivity of testing. 

Videoscope is a test tool developed and used by HP's 
Personal Computer Group (PCG) for automated perfor- 
mance, compatibility, and regression testing of interactive 
applications running on HP Vectra Personal Computers. It 
is independent of the operating system and nonintrusive. 
Nonintrusive means that it does not interfere with or affect 
the performance and behavior of the application being 
tested or the operating system. Videoscope is for internal 
use and is not available as a product. 

An overview of the operation of Videoscope is illustrated
in Fig. 1. During test creation the tester manually enters
test data to the application being tested and waits for the
correct result to show on the display. As the tester enters
the test data, Videoscope records the data into a file called
a test script. At the command of the tester Videoscope also
records the screen result in the test script. The tester continues
this process for each set of test data and at the end
of testing the test script contains a sequence of test data
interspersed with correct screen results. For retesting the
same application, Videoscope automates the process by
replacing the tester and playing back the test script to the
application (Fig. 1b). The test data is sent to the application
as it was entered during test recording. Whenever a screen
result is encountered in the test script, Videoscope waits
for it to occur on the display and then automatically does
a comparison between the current screen and the correct
screen results in the test script to determine if the test
passes or fails.

The concepts and motivation for developing Videoscope 
evolved from experiences with trying to provide the best test
coverage of applications running on the HP Vectra PC. 
When the HP Vectra PC was developed, a major goal of 
the product was that it be compatible with the industry 
standards established by the IBM PC/AT. Compatibility 



was determined by running various applications written 
for the IBM PC/AT and evaluating how well they ran on 
the Vectra. The first iteration of testing was done by hand 
using most of the engineers in the lab. This was clearly an 
inefficient and expensive way to run these tests. The tests 
were then automated using two utility programs, Superkey
and Sidekick from Borland International Incorporated.
Superkey was used to capture and play back keystrokes, 
and Sidekick was used to capture screen displays and save 
them to disc files where they were compared with files 
containing known-correct screen displays. These tools and 
certain standards for creating tests were called the regres- 
sion test system (RTS). 

Fig. 1. An overview of the operation of the Videoscope system.
(a) Test recording. (b) Test playback.

While RTS initially proved adequate, long-term use revealed
weaknesses in the system. First, mouse movements
could not be captured or played back, so any program that
used a mouse had to be tested by hand. Second, the system
was intrusive. This meant that certain programs did not
act the same when RTS was loaded. The deviations ranged
from running differently to not running at all. For example,
Microsoft* Windows could not run at all with RTS because
of conflicts over control of the interrupt vectors. Other
applications could not run because RTS used up so much
memory that there was not enough left for the application.
Finally, RTS could not be used to do performance testing
since it used system resources and affected the performance
of the system.

Videoscope was developed to replace RTS and to com- 
pensate for its weaknesses. This resulted in the following 
design objectives for the Videoscope system: 

■ It had to have the same capabilities as RTS. 

■ It had to be nonintrusive. 

■ It had to be able to do controlled-time and real-time 
performance testing. Controlled-time means that fixed 
time delays are inserted in the test scripts to control 
when certain events take place on the system under test. 
Real-time performance testing means the ability to deter- 
mine the actual response time of events taking place on 
the system under test. 

■ It had to be able to handle a mouse and any other pointing 
device that HP sells for the Vectra. 

■ It had to support HP extensions to the PC standard. 

■ Test scripts had to be portable. The intent of this objective
is to be able to port test scripts to other PC operating
systems such as Xenix, OS/2, or even HP-UX. It was also
considered necessary to be able to use a multitasking
computer system such as the HP 3000 Computer System
as a host to test multiple systems on playback.

■ It had to be able to handle a list of programs (e.g., Microsoft
Windows and HP AdvanceWrite) that we needed to
test but were unable to test with RTS.

Videoscope System 

The Videoscope system consists of two major parts: a
program called vscope that resides in a system known as
the host system, and a board called the Videoscope board
that occupies one slot in the PC running the application
being tested (see Fig. 2). This system is called the system
under test (SUT). The vscope program is used by the tester
to create and perform the actual tests. The Videoscope
board provides the links to the keyboard and pointing device
(i.e., mouse, tablet, etc.) ports on the SUT. These connections
enable the host keyboard and pointing device to
be used in place of the SUT keyboard and pointing device
during test recording. The Videoscope board is also connected
to the video adapter of the SUT, which enables it
to capture the video signal of the screen contents of the
SUT. The video signal is used to compute a digital representation
of the screen. This representation is called a signature,
and it is the signature that is stored in the test script.

Fig. 3. Data flow during test script recording.

*Microsoft is a registered trademark of Microsoft Corporation.
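
The idea of reducing a whole screen to a short signature can be illustrated in software. The Python sketch below is a stand-in, not the board's method: the Videoscope board computes its signature from the video signal, and this section does not state the algorithm or its width. A 16-bit CRC from the standard library (`binascii.crc_hqx`) is assumed here purely for illustration, and `screen_signature` is a hypothetical name.

```python
import binascii

def screen_signature(pixel_bytes: bytes) -> int:
    """16-bit signature of a captured frame (CRC-CCITT via binascii.crc_hqx)."""
    return binascii.crc_hqx(pixel_bytes, 0)

# Two simulated frames that differ in a single bit.
frame_a = bytes(80 * 25)
frame_b = bytearray(frame_a)
frame_b[100] ^= 0x01

print(hex(screen_signature(frame_a)))
print(hex(screen_signature(bytes(frame_b))))
```

Storing a few bytes per screen instead of the full image keeps test scripts small, at the cost that two different screens could in principle share a signature.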

Although two complete PCs are required for test development,
the SUT does not need to have a keyboard or pointing
device. For playback, a monitor is optional in the SUT
since normally no human will need to look at it. Also for
playback, it is possible to use any computer system with
appropriate software as the host, not just a PC. This satisfies
the portability objective.

Videoscope Software 

Fig. 2. Typical setup for using the Videoscope system with
an HP Vectra Personal Computer.

The vscope program provides the interface between the
tester and the recording and playback features of the
Videoscope system. For recording, the tester uses the keyboard
and pointing device on the host and runs the application
being tested on the SUT (see Fig. 3). The front routine
captures the test data, which is composed of keystrokes from
the keyboard and pointing device movements from the
HP-HIL¹ (HP Human Interface Link), converts it to symbolic
names, and passes it on to the parser. The parser performs
two functions with symbolic names: it saves them in the
test script, and it translates them into commands for the
Videoscope board. These commands are transmitted over
the RS-232-D line to the Videoscope board on the SUT.
Periodically, the tester must tell vscope to capture a screen
display and save it for comparison during playback. Each
time a screen is captured a signature is computed by the
code on the Videoscope board and passed back to vscope to
be saved in the test script. This whole process results in a
test script composed of two files. One file (test data) contains
keystrokes, pointer device movements and picks, and
markers for signatures, and the other file (signature file)
contains the screen signatures. Because of blinking fields
and cursors on the display, vscope takes several signatures
for one screen result. A histogram of the signatures is built
and only the most frequent ones are included in a signature
list, which goes into the file.
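
The histogram step can be sketched in a few lines of Python. This is an illustration under assumptions: the number of signatures kept and the function names are hypothetical, and the sample values are invented.

```python
from collections import Counter

def build_signature_list(samples, keep=2):
    """Return the `keep` most frequent signatures from repeated captures."""
    return [sig for sig, _count in Counter(samples).most_common(keep)]

def screen_matches(current_signature, signature_list):
    """A screen passes if its signature is any one of the accepted ones."""
    return current_signature in signature_list

# A blinking cursor alternates between two signatures; one capture caught
# the screen in mid-update and produced a stray value.
samples = [0xA1B2] * 6 + [0xC3D4] * 5 + [0x0F0F]
accepted = build_signature_list(samples)
print([hex(s) for s in accepted])
print(screen_matches(0xC3D4, accepted), screen_matches(0x0F0F, accepted))
```

Keeping only the frequent signatures lets a screen with a blinking element match in either state while discarding captures taken during a transition.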

The syntax of the symbolic names and the algorithm to
interpret them are based on those commonly used by PC
keyboard macro programs.²,³ The reason for this design
decision was to maintain the look and feel of the RTS, and
to reuse existing data structures and algorithms.

Not all keys are recognized by the front routine. Keys that
do not send data to the keyboard buffer (e.g., CTRL, Alt,
Shift) are not recognized. These keys, known as "hot keys,"
are commonly used by terminate and stay resident (TSR)
programs such as Superkey or Sidekick to activate themselves.
TSRs are so commonly used in PCs that it was
decided that vscope had to accommodate their presence on
the host and the SUT. This created the problem of how to
send such nondata keys to the SUT. Fortunately, the solution
to this problem was a natural consequence of the
method used to encode and decode the symbolic names
created by vscope. For example, the keystrokes Enter cls Enter
dir /w {cmd}getcrc{cmd} result in a clear screen, a listing of
filenames in wide format on the SUT, and the capture of
the resulting screen in the signature file. These keystrokes
result in the following stream of symbolic names being
generated by the front routine:

<ENTER>cls<ENTER>dir /w<ENTER>{cmd}getcrc{cmd}
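
A stream of this kind can be split into key names, literal characters, and commands with a small tokenizer. The Python sketch below is a hypothetical illustration, not the vscope parser: it assumes that key names are written in angle brackets and that commands are enclosed in {cmd} markers (the marker characters are hard to read in the source scan).

```python
import re

# A command between {cmd} markers, a key name in angle brackets,
# or any single literal character.
TOKEN = re.compile(
    r"\{cmd\}(?P<cmd>.*?)\{cmd\}|<(?P<key>[A-Za-z]+)>|(?P<char>.)", re.S
)

def tokenize(stream):
    """Split a symbolic-name stream into (kind, value) tuples."""
    tokens = []
    for match in TOKEN.finditer(stream):
        if match.group("cmd") is not None:
            tokens.append(("command", match.group("cmd")))
        elif match.group("key") is not None:
            tokens.append(("key", match.group("key").upper()))
        else:
            tokens.append(("char", match.group("char")))
    return tokens

for token in tokenize("<ENTER>cls<ENTER>dir /w<ENTER>{cmd}getcrc{cmd}"):
    print(token)
```

Because a named key is recognized from ordinary characters, a name such as <ENTER> typed one character at a time produces the same token as the key itself.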



order. The pattern <ENTER> will be interpreted as a press 
Enter key entry. Under this scheme, keys that do not gener- 
ate data for the keyboard buffer can be entered by typing 
the symbolic name of the key. For example, the hot key 
combination CTRL Alt, which is used to invoke the TSR 
program Sidekick, can be sent to the SUT by typing on the 
host keyboard <ctrlalt>. Just pressing CTRL and Alt simul- 
taneously would invoke Sidekick on the host. With this 
scheme any key combination tan be Sent lo the SUT and 
nol cause disruption on the host. 

The patlern (cmd| is used lo mark the beginning and end 
of a vscope command. Some vscope commands translate di- 
rectly into commands used by the Videoscope board 
firmware, and other commands are used only by vscope. 
For example, the vscope set command translates directly 
into a Videoscope firmware command to set switches on 
the board. On the other hand, the vscope log command, 
which writes information to a file on the host system, has 
no association with the Videoscope board. Other com- 
mands translate into a complex series of operations. The 
getcrc command is such an example. During test script re- 
cording this command retrieves the screen signature and 
stores it in the signature file. During test script playback, 
il reads the signatures from the test script file, compares 
them with the current screen signature and reports the 
results. 

Keystrokes and mouse movements are sent as quickly as
possible to the application. A method is provided for
slowing them down to a fixed maximum (time command).
Associated with each HP-HIL transaction is a delay which can
be set from 0 to 32,767 milliseconds. The effect of this
delay is to limit the maximum speed at which mouse
movements are sent to the application. Many mouse-oriented
applications can lose mouse movements if they come too
fast. Normally this is of no concern when the user is part
of a closed feedback loop and can reposition the mouse.
Videoscope is not tolerant of any differences on the display
that would be caused by missing a mouse movement. By
experimentally changing the delay value, the test can be
run at the maximum speed that gives consistent results.
Keystrokes can be programmed with separate press and release
delays. Each of these delays can be specified in increments
of 32 milliseconds over the range of 32 to 8,160
milliseconds. This gives a maximum typing speed of about 180
words per minute. Allowing these fixed and varying wait
times to be inserted between keystrokes provides a method
for modeling user think times for performance measurements.
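
The keystroke timing arithmetic can be checked with a short
sketch. The function names and the figure of five keystrokes
per word are assumptions made for illustration and are not
part of vscope. With the minimum 32-millisecond press and
release delays the result lands close to the typing speed
quoted above.

```python
STEP_MS = 32               # delay resolution stated in the text
MIN_MS, MAX_MS = 32, 8160  # allowed range stated in the text


def quantize_delay(ms):
    """Round a requested delay to a multiple of 32 ms and clamp it to the range."""
    steps = round(ms / STEP_MS)
    return min(max(steps * STEP_MS, MIN_MS), MAX_MS)


def words_per_minute(press_ms, release_ms, keys_per_word=5):
    """Typing speed when every keystroke costs one press and one release delay.

    keys_per_word=5 is an assumed convention, not a figure from the article.
    """
    per_key_ms = quantize_delay(press_ms) + quantize_delay(release_ms)
    return 60000 / per_key_ms / keys_per_word


if __name__ == "__main__":
    print(quantize_delay(100))       # 96
    print(words_per_minute(32, 32))  # 187.5
```

Longer delays, up to the 8,160-millisecond limit, can be used
in the same way to approximate user think times.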



This stream is interpreted as follows: 

■ <ENTER> - send press Enter key command

■ cls - send press C, L, and S key commands

■ <ENTER> - send press Enter key command

■ dir /w - send press D, I, R, Space, /, and W key commands

■ <ENTER> - send press Enter key command

■ {cmd}getcrc{cmd} - execute getcrc command
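
The interpretation listed above can be sketched as a small
tokenizer. This is not the vscope parser; the function name,
the tuple layout, and the regular expression are assumptions
made for illustration. It treats <NAME> as one symbolic key,
{cmd}...{cmd} as one vscope command, and any other character
as a single key press.

```python
import re

# One alternative per token type: a vscope command, a symbolic key, or one character.
TOKEN = re.compile(r"\{cmd\}(.*?)\{cmd\}|<([A-Za-z0-9]+)>|(.)", re.DOTALL)


def interpret(stream):
    """Translate a test script stream into a list of (kind, value) actions."""
    actions = []
    for match in TOKEN.finditer(stream):
        command, symbol, char = match.groups()
        if command is not None:
            actions.append(("command", command))
        elif symbol is not None:
            actions.append(("key", symbol.upper()))
        else:
            actions.append(("key", "SPACE" if char == " " else char.upper()))
    return actions


if __name__ == "__main__":
    for action in interpret("<ENTER>cls<ENTER>dir /w<ENTER>{cmd}getcrc{cmd}"):
        print(action)
```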

Pressing a key on the host means sending a press key
command to the Videoscope board over the RS-232-D line.
Under this scheme, an Enter key can be inserted in the test
script in two ways. The first way is to press the Enter key
on the host keyboard. The second way is to press the <
key, E key, N key, T key, E key, R key, and > key in that
order.

For playback mode the tester runs the vscope program,
selects the playback option, and specifies the test script
files to use. In vscope the play routine shown in Fig. 4 reads
the symbolic names from the test script and sends them to
the parser. This time the parser does not create another
test data file but just translates the data stream and sends
it to the SUT. Whenever a signature marker is encountered
in the test data file, the associated signature list is retrieved
from the signature file and passed to the Videoscope board.
The Videoscope board will compare the current screen
signatures with signatures passed from vscope and send
back a pass or fail indication depending on the outcome
of the comparison. If a test fails, vscope will either log the
result and continue testing or halt further testing. This
decision is based on the options specified by the tester
when vscope is set up for playback.
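
The playback flow described above can be sketched as a simple
loop. Everything named here (FakeBoard, play, the tuple
layout of the script, and the dictionary used as a signature
file) is hypothetical and stands in for vscope and the
Videoscope board only to show the control flow.

```python
class FakeBoard:
    """Stand-in for the Videoscope board: records keys and 'sees' preset screens."""

    def __init__(self, screen_signatures):
        self.sent = []
        self._screens = iter(screen_signatures)

    def send_key(self, name):
        self.sent.append(name)

    def wait_for_match(self, valid_signatures):
        # Pass when the current screen signature is in the downloaded list.
        return next(self._screens) in valid_signatures


def play(script, signature_file, board, halt_on_failure=False):
    """Send keys to the SUT and check a signature list at every marker."""
    results = []
    for kind, value in script:
        if kind == "key":
            board.send_key(value)
        elif kind == "signature":
            passed = board.wait_for_match(signature_file[value])
            results.append((value, "pass" if passed else "fail"))
            if not passed and halt_on_failure:
                break  # tester chose to halt further testing
    return results


if __name__ == "__main__":
    script = [("key", "ENTER"), ("signature", "s1"), ("key", "C"), ("signature", "s2")]
    signatures = {"s1": ["0A1B2C"], "s2": ["FFFFFF"]}
    print(play(script, signatures, FakeBoard(["0A1B2C", "000000"])))
```

The halt_on_failure flag models the tester's choice between
logging a failure and continuing or stopping the run.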

In addition to test recording and playback, vscope
provides another operating mode called the replay or
regeneration mode. Screen signatures are highly dependent on the
video system in use. Even though the display may look
exactly the same, signatures from an HP multimode card
and a monochrome card are different. If a test developed
using a multimode card needs to be played back on a
monochrome card (e.g., to test whether the software
properly supports the monochrome card), a new set of signatures
for the monochrome card needs to be captured. The replay
mode automates this process by playing back the test data
file and replacing the old signatures with new signatures
instead of comparing them as it would in a normal
playback. A single test data file can access various signature
files, allowing it to be used with several video hardware
configurations.

Videoscope Board 

The Videoscope board is partitioned into two major
sections: the Videoscope processor and the system under test
interface (see Fig. 5). The two sections operate
independently and are connected by an 8-bit bidirectional port.
The processor contains the video signature analyzer (VSA)
and the keyboard/HP-HIL emulator. The Videoscope board
is a full-length PC/AT-style accessory board (it can be used
in a PC/XT-size machine if the cover is left off). During
normal operation the board derives power from the +5Vdc,
-12Vdc, and +12Vdc lines of the SUT backplane. The
PC/AT extended backplane connector is used only to access
the additional interrupt lines.

Videoscope Processor. The Videoscope processor is based 
on an Intel 80188 microprocessor. This microprocessor was 
chosen because of its low cost and high level of integration, 
and the fact that it uses the same language development 
tools as the Intel 80286. The 80188 contains built-in timers,
DMA controllers, an interrupt controller, and peripheral 
select logic. It eliminates the need for at least two 40-pin 
packages and several smaller-scale chips. 

The processor system is equipped with 32K bytes of ROM 
and 8K bytes of RAM. Connected as peripheral devices are 
a UART for datacom, two HP-HIL slave link controllers
(SLC) for implementing two HP-HIL interfaces, the video 
signature analyzer, several switches, an LED indicator reg- 
ister, a port to emulate the DIN keyboard, and the SUT 
interface. The slave link controllers are HP proprietary 
chips for implementing the HP-HIL protocol. 

The Videoscope processor firmware is written entirely 
in Intel 80188 assembly language. It is modular and all 
main routines are reached through a jump table in RAM. 
A loader function is provided so that a user can write a 
custom module and download it into the processor RAM. 
The jump table can be overwritten so that the downloaded 
module is executed instead of the ROM-resident code. 
There is a command (normally a null operation) that can 
be used as the entry to a user module. The firmware is 
structured as a real-time interrupt-driven system. The code 
normally sits in an idle loop until a command needs to be 
processed or an interrupt serviced. Some of the command 
processing routines themselves introduce new interrupt 
service routines for their operation. 
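
The jump table arrangement can be illustrated with a short
sketch. The real firmware is 80188 assembly language; the
Python dictionary, the routine names, and the command letter
"U" for the user entry point are assumptions made here only
to show how overwriting a table entry redirects a command to
downloaded code.

```python
def rom_signature():
    return "ROM signature routine"


def rom_user_hook():
    return None  # normally a null operation


# The table lives in RAM, so its entries can be replaced at run time.
jump_table = {"S": rom_signature, "U": rom_user_hook}


def dispatch(command):
    """Reach a main routine through the jump table, never by a direct call."""
    return jump_table[command]()


def load_module(command, routine):
    """Model the loader: point a table entry at a downloaded routine."""
    jump_table[command] = routine


if __name__ == "__main__":
    print(dispatch("U"))                      # None
    load_module("U", lambda: "custom module")
    print(dispatch("U"))                      # custom module
```

Because every main routine is reached through the table, a
downloaded module can also replace a ROM-resident routine
without any change to the callers.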

Communication with the host system is through the
RS-232-D interface shown in Fig. 5. The firmware supports up
to 9600 baud using a straightforward command-response
protocol with a simple DC1 handshake. The ACK/ENQ
protocol used on the HP 3000 is also supported. All data
transfers between the host and Videoscope are in ASCII hexadec-

Video Signature Analyzer Operation

Videoscope uses a method of generating signatures based
on the HP 5004A Digital Signature Analyzer used for
troubleshooting digital systems. The HP 5004A uses a 16-bit linear feedback
register to generate a pseudorandom number. A signal under
consideration in a properly operating system is combined with
this number over a fixed period of time (a specific number of
clock cycles) to modify it into a unique representation for that
signal. The unique signature is recorded on the schematic
diagram or in a troubleshooting guide. This is done for every node
in the system. If a malfunction develops, signatures taken from
the system can be compared to those recorded when it was
operating properly and the fault can be isolated.

The Videoscope signature generator operates in the same
way but it is implemented differently. The heart of the signature
generator is a linear feedback register built from three 8-bit shift
registers (Fig. 1). To get the best resolution with the minimum
number of parts, a length of 22 bits was chosen. This allows the
register to operate with 2^22 - 1 states. The other two bits in the
register are not in the feedback path and cause the total number
of states to be multiplied by four.

The shift register is controlled by a simple hardware state
machine (Fig. 2), which has a resolution of one dot clock. The
hardware state machine is controlled by another state machine
implemented in firmware, which has a resolution of one scan
line. The firmware state machine (Fig. 3) is a set of interrupt
service routines (ISRs). The transition from state to state is by
an interrupt from either the vertical sync signal (vsync) or a counter
reading of zero.

The main code (Fig. 4) starts the signaturing process by
resetting the hardware state machine, setting the initial vertical sync
interrupt service routine to state 0, sending a go pulse to the
hardware state machine, and entering a wait loop. While in this
loop, other interrupts, such as HP-HIL, can be serviced.

The sequence of states depends on whether the starting line
on the display is greater than 0% and the stopping line is less


imal characters. This was chosen to allow complete inde- 
pendence from the host datacom. It also enables test scripts 
to be stored in readable and editable form. No compilation
or decompilation of the scripts is necessary. All commands 
and data from the host parser routine to the Videoscope 
processor are of the form: 

*(command)(length)(data specific to command)(checksum)CR DC1

with all commands beginning with the marker * and all
data characters, including the (command) and (length) fields,
included in the checksum. The commands recognized by
the firmware include the following:

■ Set Attribute (*A). Allows the host software to change
default settings.

■ Load Siglist (*C). Implements downloading of signature
lists for screen matching.

■ HP-HIL (*H). Sends an HP-HIL device X, Y, and button
data frame over the HP-HIL interface.

■ Include (*I). Sets the start and stop limits of the area of
the screen to include in a signature. Used to avoid time
variant areas of the screen.

■ Keystroke (*J, *K). Sends a keystroke keycode and shift
modifier. The *J form uses default timing while the *K
form allows explicit press and release times.

■ Load (*L). Implements downloading of code routines.

■ Resend (*R). Resends the last data record in case of
datacom error.

■ Signature (*S). Takes a signature of the screen. Used for
building the signature file for later playback.

■ Test and Report (*T). Provides dumps of various sets of
variables.

■ Wait for Match (*W). Compares screen signatures until
either the downloaded list is matched or a time-out
occurs.

Responses to the host are of the form:

(+ | -)(optional data)CR LF

where the + indicates successful completion of the
command and - indicates failure. The optional data varies by
command. For successful completion the field may contain
actual data. If it does, it is in a format similar to a command,
including a length and a checksum. In the case of a failure,
an error code, followed by an optional verbose error
message (enabled by a switch), is reported.
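
The command and response formats can be sketched as follows.
The article does not give the checksum algorithm or the field
widths, so this sketch assumes a two-digit hexadecimal length,
a checksum equal to the sum of the ASCII codes of the command,
length, and data characters modulo 256, and a marker that is
excluded from the checksum. The function names and the sample
keystroke data are likewise hypothetical.

```python
CR, LF, DC1 = "\r", "\n", "\x11"


def checksum(text):
    """Assumed checksum: sum of the ASCII codes, modulo 256."""
    return sum(text.encode("ascii")) & 0xFF


def build_frame(command, data=""):
    """Build *(command)(length)(data)(checksum)CR DC1 in ASCII hexadecimal."""
    body = f"{command}{len(data):02X}{data}"
    return f"*{body}{checksum(body):02X}{CR}{DC1}"


def parse_response(line):
    """Split a (+ | -)(optional data)CR LF response into (ok, data)."""
    if len(line) < 3 or not line.endswith(CR + LF) or line[0] not in ("+", "-"):
        raise ValueError("malformed response")
    return line[0] == "+", line[1:-2]


if __name__ == "__main__":
    print(repr(build_frame("S")))          # Signature command, no data
    print(repr(build_frame("J", "1C")))    # Keystroke with made-up data
    print(parse_response("-07\r\n"))       # failure with an error code
```

Keeping every field in printable ASCII is what lets the same
frames be stored in readable, editable test scripts.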
Video Signature Analyzer. The VSA is the key component
of the Videoscope concept. By using the technique
pioneered by the HP 5004A Digital Signature Analyzer,4 it
is possible to monitor the video signals generated by the
SUT's display adapter in real time in a totally nonintrusive
manner. The main component of the signature analyzer is
a 24-bit linear feedback shift register. The linear feedback
shift register is used to accumulate a signature (similar to
a cyclic redundancy check) of the video data stream. The
signature is a 6-digit hexadecimal number that describes
the state of the screen. The linear feedback shift register is
a pseudorandom number generator driven by the video
signal. This means that even a one-pixel difference will
change the signature. A state machine using the display's
horizontal and vertical sync signals controls when the
signature is taken. Since some applications put time variant
data on the screen such as dates, a clock, or file names and
paths, a method is provided to allow the signature to be



Fig. 3. Firmware state machine.

than 100%. If the signature is to include the entire screen, the
firmware state machine is started and stopped by the vertical
sync interrupt and the state sequence is 0-1-2. If the start line
is not at 0%, then state 3 is entered and a counter is loaded with
the proper number of lines to skip. When the counter reaches
0, state B is entered and the signature started. If the stop line is
not 100%, either state 1 or state B will set up the counter to
interrupt at the end of the desired area. The final states, 2 and
A, shut off the hardware, disable any further vsync or counter
interrupts, and signal the main routine via the done flag. The main
routine then sits in a loop (busy) waiting for the hardware state
machine to finish and then reads the signature in three 8-bit
pieces.

As a fail-safe mechanism, another timer runs while the
signature is being computed. If this timer expires before the signature
is reported as done, an error is assumed, the entire process
shuts down, and an error message is issued. Several escape
paths are included in the firmware state machine to ensure that
it won't go into a lockup state.
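
One reading of the state sequences described above is sketched
below. The text gives 0-1-2 for a full-screen signature; the
pairing of final state 2 with a vsync-terminated signature and
final state A with a counter-terminated one is an inference
made for this sketch, as are the function name and the use of
percentages as arguments.

```python
def state_sequence(start_pct, stop_pct):
    """Return the firmware states visited while taking one signature."""
    states = ["0"]                 # initial vsync interrupt service routine
    if start_pct > 0:
        states += ["3", "B"]       # count down skipped lines, then start
    else:
        states.append("1")         # signature starts at the vsync interrupt
    # Assumed pairing: vsync ends the signature in state 2, the counter in state A.
    states.append("2" if stop_pct >= 100 else "A")
    return states


if __name__ == "__main__":
    print("-".join(state_sequence(0, 100)))   # 0-1-2
    print("-".join(state_sequence(25, 80)))   # 0-3-B-A
```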



started and stopped by a count of scan lines after the start 
of the display. In this way only the nonvariant portion of 
the screen will be included in the signature. 

To accommodate the various display adapters used in a
PC, the video signature analyzer has an eight-input
multiplexer which can select from eight separate video streams.
This allows exhaustive testing of all image planes in a
multiplane adapter (e.g., EGA or VGA) and minimizes
testing when the different planes contain redundant
information. The tester can use the vscope set command to control
how many planes are signatured. A separate signature is
computed for each plane, and when doing a multiplane
match to a signature list, all enabled planes must match or
a failure is reported.

To reduce the part count while maintaining reasonable
precision and speed, the linear feedback shift register is
wired as 22-bit maximum length with an additional two
bits following. This provides a maximum pattern length of
4 x (2^22 - 1) states. Although a typical display has many
more states than this (a 640 x 350-pixel display has over
2^224,000 distinct states), the probability of having an
incorrect display with a valid signature is extremely low. In
instances where a match of an invalid screen (an alias
signature) does occur, the next screen signatured will
almost certainly fail. Very few alias signatures have been
detected in actual use.
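
The register arrangement can be modeled in a few lines. The
article does not state the feedback taps, so this sketch
assumes taps at stages 22 and 21, a commonly tabulated
maximum-length choice for a 22-bit register. The function
name and the serial bit-list input are also assumptions; the
real hardware is clocked by the video dot clock.

```python
MASK22 = (1 << 22) - 1


def video_signature(bits, taps=(22, 21)):
    """Accumulate a 6-digit hexadecimal signature from a serial bit stream.

    taps are assumed stage numbers (1-based); they are not from the article.
    """
    lfsr = 0  # 22-bit section inside the feedback path
    tail = 0  # two following stages outside the feedback path
    for bit in bits:
        feedback = bit & 1
        for tap in taps:
            feedback ^= (lfsr >> (tap - 1)) & 1
        leaving = (lfsr >> 21) & 1              # bit shifted out of stage 22
        lfsr = ((lfsr << 1) | feedback) & MASK22
        tail = ((tail << 1) | leaving) & 0b11
    value = (tail << 22) | lfsr                 # 24 bits in total
    return f"{value:06X}"


if __name__ == "__main__":
    screen = [0, 1, 1, 0, 1] * 50
    changed = list(screen)
    changed[17] ^= 1                            # flip a single pixel
    print(video_signature(screen), video_signature(changed))
```

Because the feedback section is linear and its shift is
reversible, a single changed input bit always leaves a
different value in the 22-bit section, which matches the
statement that a one-pixel difference changes the signature.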

The VSA is controlled from the Videoscope processor
by a high-speed state machine implemented partially in 
hardware and partially in firmware. The firmware uses a 
state machine consisting of an idle loop and several inter- 
rupt service routines. The present hardware portion of the 
state machine will operate with video dot rates in excess 
of 50 MHz and scan rates in excess of 32 kHz. See the box 
on page 62 for more details about the VSA state machine 
architecture. 

At the completion of the signaturing process the firmware 
can either pass the signature back to the host or compare 
it against a list of valid signatures downloaded from the 
host. The first method is generally used during script cre- 
ation when the vscope program is building the signature 
list file, and the second method is used during test script 
playback. Datacom traffic is kept to a minimum by
downloading lists of signatures and doing the comparisons
locally on the VSA board.
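
The local comparison can be sketched as a loop that keeps
taking signatures until one appears in the downloaded list or
a time-out expires. The function name, its arguments, and the
injectable clock are assumptions made for illustration; the
firmware itself is interrupt-driven assembly code.

```python
import time


def wait_for_match(take_signature, valid_signatures, timeout_s, clock=time.monotonic):
    """Return (matched, last_signature) after a match or a time-out."""
    deadline = clock() + timeout_s
    while True:
        signature = take_signature()
        if signature in valid_signatures:
            return True, signature
        if clock() >= deadline:
            return False, signature


if __name__ == "__main__":
    screens = iter(["000001", "000002", "0A1B2C"])
    print(wait_for_match(lambda: next(screens), {"0A1B2C"}, timeout_s=1.0))
```

Only the final pass or fail indication has to travel back to
the host, which is how the scheme keeps datacom traffic low.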

Keyboard/HP-HIL Emulator. Videoscope has the capability
of emulating the keyboard used on the earlier Vectra models 
as well as the industry standard DIN keyboard used in both 
the PC/XT and PC/AT protocols. The new Vectra models 
use the DIN keyboard and are compatible with the PC/XT 
and PC/AT protocols. For the HP-HIL interface, there are 
two slave controller chips on the board, which are capable 
of emulating any two of the following devices: a mouse, a 
tablet, a touchscreen, or a keyboard. The controllers are 
directly driven by routines in the Videoscope processor 
firmware and no additional processors are needed. Addi- 
tional devices can be emulated by changing the firmware. 
SUT Interface. The SUT interface port provides a com- 
munication path between the Videoscope processor and 
the SUT processor. This is an 8-bit bidirectional port with
provision for generating an interrupt when written to. In 
the SUT this can either be a nonmaskable interrupt (NMI) 
or any one of the interrupt (INTRn) lines on the backplane. 
The desired line is selected by a switch. The SUT must 
have an interrupt handler installed for this feature to be 
used. Also included in the SUT interface are 8K bytes of 
ROM and 8K bytes of RAM. This 16K-byte address space 
can be configured within the C0000-DFFFF address range 
of the SUT. The ROM is seen by the SUT power-up routines 
as an option ROM and normally includes the necessary 
interrupt handler. The I/O ports of the SUT interface can 
be located at any of several addresses normally reserved 
for the IBM prototype card. Both the memory and the I/O 
ports can be relocated to avoid conflicts with hardware 
installed in the SUT. The current implementation of the 
Videoscope system does not make use of the SUT interface. 
However, the hooks are available for users to create routines 
for special-purpose testing requirements. 



Conclusion 

Videoscope has met or exceeded the original objectives 
established for the system. One minor disappointment is 
that the signature generated from the video signal is
not unique. However, the probability of two screens having 
the same signature is very small, so this is a very minor 
problem and simple workarounds have been found. 

The design and implementation of Videoscope was highly 
leveraged. The video signature analysis hardware and 
firmware are based on the HP 5004A Signature Analyzer. 
The data structures and algorithms for interpreting data in
the script file are based on those commonly used by
keyboard macro programs, and the data communication
software used by vscope to communicate with the Videoscope
processor firmware is a package called Greenleaf Data
Comm Library from Greenleaf Software, Inc.

Videoscope provides a major productivity tool to im- 
prove the quality of software. The PC intrinsically is an 
interactive computer system. This means that batch-style 
tests cannot adequately test the capability of any software 
written for a PC. Videoscope provides an automatic alter- 
native to slow, error-prone, and expensive manual testing. 
Currently there are over 100 Videoscope systems in use at
18 HP divisions.

Michael J. Wright

Software engineer Mike Wright joined the team
developing the HP Real-Time Data Base during the early
design stages in 1987. His main responsibility was the
interactive query/debug software. He joined the
Manufacturing Productivity Division of HP in 1985,
offering the experience of some twenty years of
programming and systems work in business and
manufacturing systems. Mike attended the University
of Wisconsin, from which he received a master's
degree in 1965. He is married and enjoys riding
motorcycles.




Cynthia Givens

Cynthia Givens' responsibilities for the HP RTDB
project ranged from the initial investigation, to
internal/external design, to testing and documentation.
She has since moved on to the development of an
application integration tool. Among her past software
projects are the MMC/1000 manufacturing application
and AGP/DGL graphics packages. Cynthia's
BA degree in computer science is from the University
of Texas at Austin (1983). Born in Durango, Colorado,
she's married and lives in Santa Clara, California.
She enjoys hiking, skiing, and camping.




6 Real-Time Data Base



Michael R. Light

Mike Light joined HP in 1980, shortly after receiving
his BS degree in computer science from the University
of Vermont. He contributed to the development
of the HP RTDB product as an R&D engineer, and his
past responsibilities include the Image/1000,
Image/1000-2, and Image/UX environments. Mike
was born in Panama City, Florida, and lives in San
Jose, California. "Games in any form" is how he
describes his leisure interests.




Le T. Hong

Contributing to all development phases of the HP
Real-Time Data Base project, Le Hong analyzed the
user requirements, assisted in scheduling and
prioritizing, and generally acted as the technical
leader for the project. She has since moved to
technical marketing in HP's Value-Added Channels
program. In earlier assignments, she has contributed
to the maintenance and enhancement of the IC-10
integrated-circuit lot tracking system, the EN-10
engineering data collection system, and the PCB/3000
printed-circuit-board lot tracking system. Le's
BA degree in computer science is from the University
of Washington (1983). She was born in Saigon,
Vietnam, and lives in Fremont, California.




Feyzi Fatehi

Working on the HP Real-Time Data Base for over
three years, Feyzi Fatehi designed and implemented
the indexing mechanisms and contributed to all
phases of developing this precision tool. He came to
HP in 1986 after working as a plant automation
engineer at a Texas power plant. Feyzi's BSME
degree (1982) is from the University of Texas at
Austin and his master's degree in computer science
(1985) is from Southwest Texas State University.
He's currently studying toward an MBA degree
at Santa Clara University. He was born in Teheran,
Iran, lives in Sunnyvale, California, and serves as
a Junior Achievement advisor at the nearby Mountain
View High School. His favorite pastimes include
tennis, hiking, and skiing.

Ching-Chao Liu

A software development engineer at HP's Industrial
Applications Center, Ching-Chao Liu contributed his
expertise to all phases of the RTDB project. In
previous assignments, he was the technical leader of
the HP ALLBASE DBCORE project, the project leader
for the HP-UX MULTIPLAN tool, and a designer of
other software projects. He came to HP in 1980.
Ching-Chao coauthored two papers for data base
conferences and is a member of the Association for
Computing Machinery and of SIGMOD. His BS degree
in nuclear engineering is from the National Tsing Hua
University in Taiwan (1972) and his MS degree in
computer science is from Oregon State University
(1979). He was born in Taiwan, is married, and has
two children who sparked his special interest in child
education. He lives in Sunnyvale, California. In his
leisure time, he likes swimming, playing bridge, and
listening to classical music.

Thomas O. Meyer

Tom Meyer was the project manager for the HP 9000
Model 835 SPU hardware. Since he joined HP in 1977,
his design projects have included a memory board for
the HP 250 Computer, a power supply for the HP
9000 Model 520 Computer, the battery backup
regulator for the HP 9000 Model 825 Computer, and
project management for the HP 9000 Model 825
and HP 3000 Series 925 Computers. Tom joined
HP in 1977, soon after obtaining his BSEE degree
from the South Dakota School of Mines. He has
coauthored two previous articles for the HP Journal.
He was born in Rapid City, South Dakota, and
lives in Fort Collins, Colorado. His list of outside
interests includes sailing and sailboat racing,
scuba diving, skiing, hiking, and four-wheel-drive
vehicles.



Jeffrey G. Hargis

Designing the processor-dependent hardware and
conducting environmental testing of the HP 9000
Model 835 were Jeff Hargis' first major projects after
joining HP's Systems Technology Division in 1987. He
has since moved on to the design of components for
new SPUs. He attended Ohio State University,
where he obtained a BSEE degree in 1987. Jeff was
born in Athens, Ohio, and is married. He lives in Fort
Collins, Colorado. He enjoys playing the piano,
basketball, and backpacking.



John Keller

Design of the floating-point controller was John
Keller's main contribution to the HP 9000 Model 835
project. His list of past design projects includes CMOS
processes, RAMs, and circuits for the HP 3000 Series
950, 925, and 955 and HP 9000 Models 850, 825, and
855 Computers. He now designs ICs for future
computer products. His BSEE degree is from the
University of Wisconsin (1981) and his MSEE degree
is from the University of California at Berkeley
(1985). He has authored and coauthored a number
of papers and articles for conferences and
publications. John was born in Milwaukee, Wisconsin. He
is a volunteer literacy tutor in Cupertino, California,
where he lives. In his spare time, he likes studying
languages, skiing, and travel.



Floyd E. Moore

Floyd Moore designed the 16M-byte memory circuitry
for the HP 9000 Model 835 and worked on the design
and testing of the HP 3000 Series 935 system. He is
presently working on the design of an SPU for a
future HP Precision Architecture system. He
came to HP in 1986, working on a project
associated with the tape-automated bonding
technique. Floyd was born in Richmond, California.
His bachelor's degree is from the California
Polytechnic State University at San Luis Obispo.
He is married and lives in Fort Collins, Colorado.
His favorite pastimes are photography and audio
engineering.



Russell C. Brockmann

Most of Russ Brockmann's recent design activities
have concentrated on the processor circuit for the HP
9000 Model 835 and HP 3000 Series 935 Computers.
He also designed the battery backup unit used in
the HP 9000 Models 825 and 835 and HP 3000
Series 925 and 935 Computers. He completed
design of the Model 825/Series 925 processor circuit.
Currently, he is developing components for future
SPUs. He joined HP shortly after obtaining his BSEE
degree from Oregon State University in 1985. He
also attended Western Baptist College in Salem
(1977-1979) and Lane Community College in
Eugene (1981-1983), both in Oregon. Russ
teaches Sunday school and serves in a variety of
other church activities in Fort Collins, Colorado,
where he lives. He was born in Myrtle Point,
Oregon, is married, and has three children. Fishing,
camping, playing a 12-string guitar, and bible
study are some of his favorite pastimes.

Jeffery J. Kato

During development of the HP 7980XC Tape Drive,
Jeff Kato's contributions focused on the architecture
and design implementation for the data compression
chip and firmware design. He has also designed read
electronics for the HP 7978A Tape Drive and the PLL
and PLL IC for the HP 7980A Tape Drive. He came to
HP in 1982, the same year he received his BSEE
degree from Montana State University. He is
named as a coinventor in three pending patents
describing data compression and blocking
techniques. Jeff has coauthored a previous article for
the HP Journal. He is an active member of his
church and a United Way volunteer. He was born
in Havre, Montana, is married, and lives in Greeley,
Colorado. Among his spare-time activities, he likes
skiing, basketball, softball, and camping.

Mark J. Bianchi

Analog circuit design and control systems are Mark's
principal professional interests. He was the R&D
engineer for the design, layout, and testing of the
data compression chip for the HP 7980XC. On
previous projects, he designed the read channel
electronics for the HP 9144A and HP 9142A Tape
Drives. Mark received his BSEE degree from
Pennsylvania State University in 1984, the year he
also joined HP's Greeley Division. Born in Vineland,
New Jersey, he lives in Fort Collins, Colorado. His
list of leisure activities includes weightlifting,
softball, volleyball, basketball, boardsailing, skiing,
camping, and photography.



David J. Van Maren

Dave Van Maren joined HP's Vancouver Division in
1980, after receiving his BSEE degree from the
University of Wyoming. His responsibilities as an R&D
engineer on the HP 7980XC Tape Drive included the
data compression and tape capacity benchmarks, the
tape format definition and firmware, and the data
buffer management firmware. In past projects, he
worked on formatting VLSI tools for both the HP
7979A and HP 7980A Tape Drives and on the servo
firmware for the HP 7978A. He coauthored an
article for the HP Journal in 1983 about the latter
project. Dave's work on the VLSI FIFO circuit for the
tape drives resulted in a patent, and he is named
coinventor in four pending patents describing data
compression and blocking techniques. He was
born in Casper, Wyoming, is married, and has three
young sons. He lives in Fort Collins, Colorado. He
and his wife spend much of their free time teaching
natural family planning.




66 HEWLETT PACKARD JOURNAL JUNE 1989 



© Copr. 1949-1998 Hewlett-Packard Co. 



32 Super-Blocking

Mark J. Bianchi

Author's biography appears elsewhere in this
section.

Jeffery J. Kato

Author's biography appears elsewhere in this
section.

David J. Van Maren

Author's biography appears elsewhere in this
section.



35 Lightwave Component Analysis 

Michael G. Hart

As development engineer, Mike Hart was involved in
designing firmware for the HP 8702A Lightwave
Component Analyzer and, earlier, for the HP 8753A
Network Analyzer. He continues to work on similar
assignments for new HP products. He attended
Utah State University, where he earned his BSEE
degree in 1983. His MSEE degree is from Cornell
University (1984). He joined HP in 1984. The
lightwave component analyzer is the subject of a paper
Mike coauthored for an RF and microwave
symposium. He is a member of the IEEE. He was born
in Sacramento, California, and in his off-hours, he
serves as the organist for his church in Santa Rosa,
California, where he lives. Other recreational
activities include playing the piano, softball, tennis, and
travel.




Paul Hernday

Paul Hernday is an R&D project manager in HP's
Network Measurements Division in Santa Rosa,
California. With HP since 1969, he has been involved
with new-product developments in sweepers, scalar
and vector network analyzers, and lightwave
component analyzers. His most recent project has been
the development of a dual-laser heterodyne system
for the calibration of lightwave receivers. Paul
earned his BSEE degree at the University of
Wisconsin in 1968. He is married, has two children,
and lives in Santa Rosa, California. Boardsailing,
music, and robotics are among his diverse leisure
interests.



Geraldine A. Conrad

As a development engineer on the HP 8702A Lightwave
Component Analyzer, Gerry Conrad worked on
measurement accuracy and system performance
analysis. She continues to be involved in similar
developments, more specifically in microwave circuit
design and optical system evaluation. In earlier
years of her career at HP, she worked first as a
product marketing engineer and later joined a
design team on the HP 8753A Network Analyzer.
Gerry originally joined HP as a summer student in
1980, then accepted a permanent position two
years later. Her BSEE degree is from the University
of Florida (1982). She has authored a paper
describing an RF network analyzer verification
technique and coauthored a symposium paper
about high-frequency measurement of lightwave
systems. She is a member of the IEEE. Born in
Trincomalee, Sri Lanka, Gerry is married and lives in
Santa Rosa, California. Her leisure interests include
travel, quilting, camping, hiking, cooking, and
reading.



Roger W. Wong

Lightwave and microwave measurement technologies
are Roger Wong's special interests, and as the R&D
program manager, he carried overall responsibility
for the development of the HP 8702A Lightwave
Component Analyzer. Past responsibilities included
scalar network analyzer detectors, directional bridges
and accessories, and the development of microcircuits
and associated components for microwave
applications. Roger joined the Microwave Division
of HP in 1968, after obtaining his MSEE degree from
Columbia University. His BSEE degree is from
Oregon State University (1966). He is a member of the
IEEE and the National Society for Professional
Engineers. He has authored or coauthored a number
of papers and articles on microwave transistor
modeling, microwave amplifier design, and
high-speed lightwave measurements. Several patents
describing lightwave measurement techniques
Roger developed are pending. He was born in
Honolulu, Hawaii, is married, and has a five-year-old
son. He lives in Santa Rosa, California. His
favorite pastimes include swimming, hiking,
cooking, and baking bread.

Kenneth W. Shaughnessy

The HP 8753A and the HP 8754A Vector Network
Analyzers and a number of lightwave instruments are
among the major projects to which Ken Shaughnessy
has contributed design ideas. On the HP 8702A
Lightwave Component Analyzer System, he
worked as a product designer. He joined the Santa
Rosa (California) Division of HP in 1975 as a printed
circuit board designer after previous positions as
a mechanical designer at Sperry Marine Systems
and Teledyne Avionics. Ken attended the University
of Virginia School of Engineering. He was born
in Chicago, Illinois, is married, and has five
children. He lives in Kenwood, California.
Woodworking and automobile and bicycle repair are his
favorite spare-time activities.



Kent W. Leyde 

In development of the HP 8702A Lightwave Component
Analyzer, Kent Leyde's design work concentrated
on microcircuits and optics for the lightwave [...].
As a development engineer, he has since started
work on the signal acquisition and processing
elements of a new product. Kent's BSEE
degree (1984) and MSEE degree (1985) are from
Washington State University. While attending college,
he worked for local companies on such product
developments as process controls and digital
protective relays for high-voltage ac transmission
systems. He joined HP in 1985. He coauthored an
article describing an optical measurement system
soon to be published in Optical Engineering. Born
in Seattle, Washington, he is married and has a
small daughter. He lives in Santa Rosa, California.
In his off-hours, he enjoys boardsailing and skiing.



Rollin F. Rawson 

The HP 8753A and HP 8754A Network Analyzers
and the HP 8756A Scalar Network Analyzer are
among the many product developments to which
Fred Rawson has contributed. His work on the HP
8702A Lightwave Component Analyzer has focused
on source leveling and the thermal loops, the receiver
power supplies, the RF attenuator and
source, and the RF interface. He has worked for HP
since 1960. Fred's BSEE degree is from California
State University at San Jose. Before enrolling at San
Jose, he served as a staff sergeant in the U.S. Air
Force. Born in Laguna Beach, California, he is married
and has four children. He lives in Santa Rosa,
California. In his leisure time, he enjoys collecting,
refurbishing, and driving Studebaker automobiles;
he also collects stamps.



Robert D. Albin

Dale Albin was project manager for the lightwave
sources and receivers discussed in this issue of the
HP Journal. In his twelve-year career at HP, he has
been a production engineer at the Microwave
Technology Division working on device testing and
GaAs FET processing and a development
engineer/project leader on millimeter source modules
at the Network Measurements Division. His BSEE
degree (1977) is from the University of Texas at
Arlington, and his MSEE degree is from Stanford
University. Two patents relating to finline technology
are based on his ideas. Dale has delivered
papers at HP symposia and has written a previous
HP Journal article about millimeter source modules.
He was born in Dallas, Texas, and lives in
Santa Rosa, California. His outside interests include
running, skiing, bow hunting, reading, and
aviation.



58 — Videoscope



Myron R. Tuttle 

Before working on the hardware and firmware design
for the Videoscope tool, Myron Tuttle's
responsibilities included development of the
HP 45981A Multimode Video Adapter and
the HP 2625 and HP 2628 Terminals. He joined the
Advanced Products Division of HP in 1974 and is a
member of both the Association for Computing
Machinery and the IEEE.
Myron's BSEE degree is from the University of
California at Berkeley. He served in the U.S. Navy
as an electronics technician. Born in San Francisco,
California, he is vice president of a homeowners
association in Santa Clara, California, where he
lives. His hobbies are amateur radio and computer
programming.



Danny Low 

Danny Low joined the Cupertino Division of HP in
1972, shortly after obtaining a degree in computer
science from the University of California at Berkeley.
He developed the host software for the Videoscope
tool and continues to support it. In the past, his
responsibilities included software quality control for
the original MPE system. He also developed system
software for the HP 300 and for the MPE-V computers.
Danny was born in Canton, China, and lives
in Mountain View, California. His favorite off-hours
activities focus on computers, science fiction, and
photography.




J. Barry Shackleford 

Barry Shackleford spent almost three years as a
development engineer at Yokogawa Hewlett-Packard
in Japan, where he worked on a Kanji computer
terminal. His more recent work in the Systems
Architecture Laboratory of HP Laboratories yielded
the background for the neural network programming
approach he describes in this issue of the HP
Journal. Before joining HP in 1981, he worked for
Hughes Aircraft Corporation, designing a telemetry
computer for the still-functioning Pioneer Venus
spacecraft, and for Amdahl Corporation, developing
hardware for Models 470 and 580 mainframe
computers. Several pending patents are based on
his ideas. Barry's BSEE degree is from Auburn
University (1971), and his MSEE degree is from the
University of Southern California (1975). He is a
member of the IEEE. He was born in Atlanta, Georgia,
and lives in Sunnyvale, California. He speaks
Japanese and practices Japanese brush writing as
a hobby. He has a pilot license and likes large-format
photography, woodworking, and hiking.



79 — Electromigration Model 



Vladimir Naroditsky 

As a professor of mathematics at California State
University at San Jose, Vladimir Naroditsky contributed
his expertise in the electromigration simulation
project described in this issue of the HP Journal.
He emigrated from the Soviet Union in 1979, and his
bachelor's degree is from Kiev University (1976).
His PhD degree is from the University of Denver
(1982). Vladimir has authored 24 papers in the
field of mathematical physics, his professional
specialty. He is a member of the American Mathematical
Society, the Mathematics Association of
America, and the Society of Industrial and Applied
Mathematics. He was born in Kiev, is married, and
lives in San Francisco, California. In his leisure time,
he enjoys classical music.




Wulf D. Rehder

Wulf Rehder, who describes his place of origin
as "a tiny village in Northern Germany," pursued
studies at universities in Hamburg, Freiburg, Tokyo,
Berkeley, and finally Berlin, where he earned his PhD
degree in 1978. Ancient languages, mathematics,
and physics are among his subjects of study, and
he has held various teaching positions, most recently
as professor of mathematics at California
State University at San Jose. He was a statistician
at HP's System Technology Division until last
December, when he became systems performance
manager at Metaphor Computer Systems
in Mountain View, California. Wulf is a prolific writer
and has published some 35 papers on mathematics,
statistics, philosophy, and linguistics. He's
working on his third book. He is married, has two
children, and lives in Santa Clara, California. His
hobbies include the study of the middle ages,
especially the 11th century. He also specializes in
early 19th-century literature.



Paul J. Marcoux 

Paul Marcoux is a project manager for studies
involving failure analysis and failure physics for
integrated circuits. In this issue of the
HP Journal, he reports on a new model for simulating
electromigration in thin metal films. Aspects of
integrated circuit process technology have been
focal to most of his past projects at HP. He has
written about 20 articles about chemistry and IC
processing for professional journals, and he is a
member of the American Vacuum Society. A patent
has been awarded for an IC process he helped
develop. Paul's BS degree is from
Villanova University (1970), and his PhD degree in
chemistry is from Kansas State University (1975).
He did postdoctoral work in chemical kinetics at
Pennsylvania State University. Born in Pawtucket,
Rhode Island, Paul is married and has two
daughters. He lives in Mountain View, California,
and his favorite pastime is photography.



Paul P. Merchant 

As a project leader at HP Laboratories, Paul
Merchant has had a variety of R&D projects involving
electromigration, silicide process development, and
multilevel metallization. He handled modeling,
testing, and planning in the electromigration study
discussed in this issue. Processing and properties of
thin films are his specialty. He has published many
papers on solid-state physics and chemistry and
on microelectronics and is a member of both the
American Physical Society and the American Vacuum
Society. Paul's BS degree in physics is from
the University of Vermont (1972), and his ScM degree
(1974) and his PhD degree in physics (1978)
are both from Brown University. Two postdoctoral
positions, one in France and one at Brown University,
involved laser materials and photoelectrolysis.
He is married, has three sons, and lives in Menlo
Park, California. In his off-hours, he enjoys playing
the piano, astronomy, and bicycling.



Neural Data Structures: Programming
with Neurons



Networks of neurons can quickly find good solutions to 
many optimization problems. Looking at such problems in
terms of certain neural data structures makes programming 
neural networks natural and intuitive. 

by J. Barry Shackleford 



A FEW YEARS AGO at HP Laboratories we were
privileged to have John J. Hopfield, professor with
the Divisions of Chemistry and Biology at the California
Institute of Technology, give us a talk on computing
with neurons. The lecture was fascinating. I was
particularly excited by the fact that networks of neurons could
quickly find good solutions to optimization problems like
the traveling salesman problem. I had once written a computer
program to route the interconnections on a computer
circuit prototyping board (Fig. 1). Thus, I was painfully
aware that simply proceeding to the closest unconnected
point can lead to disasters when the objective is to
minimize the total length of wire connecting a set of
terminals.

At the end of the talk I still could not see how to program
these networks. Being an engineer, I wanted to ask, "How
do you determine the resistor values and interconnections
for the traveling salesman problem?" However, I was reluctant
to ask such a prosaic question in light of the arcane
questions I was hearing concerning differential equations
and energy surfaces.

I assumed that I could get the answer from my colleagues,
so the question went unasked. The answer was not
forthcoming, however. No one that I asked had come away with
the insight of how to construct such a network.

After a week of intensive study of some of Hopfield's
published work with HP Laboratories colleague Thomas
Malzbender, progress was made. Tom's mathematical insight
and my desire for some type of higher-level representation
of what was happening at the differential equation
level produced the approach presented here. I knew I was
on the right track when, a week later, I solved the eight
queens problem using the neural data structure models
that had emerged the week before.

Introduction 

By constructing networks of extremely simple nonlinear
summation devices (now termed "neurons"), complex
optimization problems can be solved in what amounts to
a few neural time constants. This is a collective,
consensus-based style of computing where a different but probably
equally good answer may result the second time the problem
is solved. This is a style of computing for areas where
good (i.e., within a few percent of optimum) solutions must
suffice, where there is seldom time to wait for the "best"
answer.

In many cases, the data itself serves to determine the 
architecture of the neural network required for a given 
problem. Other cases require the problem to be mapped
onto structures that may not be obvious at first sight. By 
looking at problems in terms of certain neural data struc- 
tures, we may find neural programming to be quite natural 
and intuitive. 

To develop an intuition for programming with neurons, 
a conceptual model is needed. This model has three layers. 
The innermost layer is the Hopfield neuron. Changing the 
properties of the neuron has a global effect on the problem. 
The second layer is composed of elemental data structures 
suited to the properties of neurons. The third layer is the 
method by which the gap between the data structure and 
the problem statement is bridged. It can be explained and 
observed but, like programming in conventional computer 
languages, it is best practiced. 

Hopfield Neurons 

Forming the core of our conceptual model is the Hopfield
neuron (Figs. 2 and 3). Simply stated, it is a nonlinear 




Fig. 1. (a) A problem similar to the traveling salesman
problem: five terminals to be connected together in the shortest
possible distance with a single strand of wire. (b) The simple
heuristic approach of proceeding to the closest unconnected
node can often yield poor results. (c) The optimum result
needs to consider all the data at once.

Fig. 2. Conceptual circuit model
for a Hopfield neuron. The number
of either excitatory or inhibitory inputs
is unconstrained, as is the
positive or negative output swing
of the summer. The gain element
multiplies the summer's output by
a constant and then feeds it to a
nonlinear element where it is
compressed between 0 and 1.



summing device. We can view the Hopfield-type neuron 
as being divided into three sections. 

The first section does an algebraic summation of the 
input signals. These can be both excitatory and inhibitory. 
The excitatory inputs are summed directly and the inhibitory
inputs are first inverted (i.e., multiplied by -1) before
summation. Let us call the output of the summation element
x.

The output of the summation element is then sent to the
second section, a gain element, where it is multiplied by 
a constant G. High values of G make the neurons very 
sensitive to small variations of x around zero at the cost 
of reducing the compliancy of the entire system. A very 
high gain causes the neuron to behave as an analog/digital 
circuit known as a comparator. A comparator has an output 
of 1 if the sum of its inputs is the least bit positive. Other- 
wise its output is 0. In a network composed of comparators 
there is no in-between, no compliancy: the network loses 
its ability to compromise. On the other hand, if the gain is 
too low, all of the neurons will be floating somewhere near
the same value. Like a large party with all the guests talking 
at the same level, no one conversation can be distinguished 
from the others. 

The third section is the output stage, which performs a
nonlinear transformation on the signal Gx. The relation:

$$\frac{1}{1 + e^{-Gx}}$$

provides a symmetric sigmoid (i.e., S-shaped) transfer function
(Fig. 4). This type of curve is used in photographic
films and signal compression systems where a balance must
be struck between fidelity and wide dynamic range. The
curve ranges from 0 for large negative values of Gx to 1 for
large positive values of Gx. When Gx is zero, the output is
one half. The nonlinear characteristic of the neuron effectively
gives a network of neurons a wider dynamic range




Fig. 3. Various neuron symbols, the circle being the most
abstract. Often a circle will be shaded to indicate that a neuron
is active. The diameter of the circle can also be varied to
show the relative activity of the neuron.



and a higher degree of compliancy. 
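
The three sections of the neuron described above can be sketched in a few lines of Python. This is a minimal illustration rather than the author's implementation: the function names `sigmoid` and `neuron_output` are hypothetical, and the numerically stable form of the logistic function is an implementation choice made here so that very high gains do not overflow.

```python
import math

def sigmoid(z):
    """Logistic function 1 / (1 + e^-z), written to avoid overflow."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def neuron_output(excitatory, inhibitory, gain):
    """Output of one Hopfield-type neuron (hypothetical helper).

    excitatory, inhibitory: iterables of input signal values.
    gain: the constant G applied to the summed input x.
    """
    # Section 1: algebraic summation; inhibitory inputs are inverted.
    x = sum(excitatory) - sum(inhibitory)
    # Section 2: gain element.
    gx = gain * x
    # Section 3: sigmoid output stage, compressing Gx into (0, 1).
    return sigmoid(gx)

# With no net input the output sits at one half.
print(neuron_output([], [], gain=5.0))
# A very high gain makes the neuron act like a comparator.
print(neuron_output([0.01], [], gain=1000.0))
print(neuron_output([], [0.01], gain=1000.0))
```

With a very low gain the same inputs all produce outputs close to one half, which is the "large party" situation described in the text.
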
N-Flops 

The n-flop (Fig. 5) represents a natural first stage of or- 
ganization for neurons. We can say "natural" because it is 
easy to construct, and once constructed, it is robust. Being 
robust, it can serve as a base for other organizational
structures. There is precedent in nature for n-flops in the form
of neural cells exhibiting lateral inhibition (defined later).

An n-flop is a local aggregate of n neurons programmed 
by their interconnections to solve the constraint that only 
one of the n will be active when the system is in equilib- 
rium. The term n-flop is derived from flip-flop, a computer 
circuit that has only two stable states. An n-flop has n 
stable states. 

Two independent 6-flops would behave much the same 
as a pair of dice used in games of chance — the final state 
of each should be a random number from 1 to 6. However, 
dice can be "loaded" by moving the center of gravity away 
from the geometric center. The same can be done to a 6-flop 




[Plot: neuron output versus x, for x from -5 to 5]

Fig. 4. Higher gain values sharpen the sigmoid curve and
thus reduce the compliancy of the system. At very high gains
the shape of the curve approaches that of a step function.

Fig. 5. Each neuron in the n-flop
strives to suppress the others.
Eventually one wins out. K supplies
the energy that the winning
neuron will use to suppress the
others.



by applying a small excitation to one of the neurons. There 
is an even greater potential here: the two 6-flops can be 
interlinked so that they always produce the same sum, say
7. Let the number 1 neuron of the first 6-flop serve as an 
excitatory input for the number 6 neuron of the second 
6-flop. Number 2 goes to number 5 and so on. Then the 
second 6-flop can be wired back to the first in a similar 
manner. The stronger the bias weights (i.e., excitations),
the higher the probability that the total outcome will be 7. 
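
The cross-wiring of the two 6-flops can be written down as a list of excitatory links. The sketch below only enumerates which neuron feeds which; the function name `sum_links` and its parameters are hypothetical, and the strength of each link (the bias weight) is left out.

```python
def sum_links(n_faces=6, target_sum=7):
    """Excitatory links (neuron in one flop, neuron in the other flop).

    Neuron `face` of one 6-flop excites neuron `target_sum - face` of
    the other, so that the pair is biased toward the desired total.
    The same list applies in the reverse direction.
    """
    links = []
    for face in range(1, n_faces + 1):
        partner = target_sum - face
        if 1 <= partner <= n_faces:
            links.append((face, partner))
    return links

for source, destination in sum_links():
    print(f"neuron {source} of flop A excites neuron {destination} of flop B")
```
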

The interconnections for an n-flop are arranged such that 
the output of each neuron is connected to an inhibitory 
input of each of the other n - 1 neurons (Figs. 5 and 6). We
will borrow a term from the biologists and refer to this
kind of connection as lateral inhibition. Additionally, there
is a fixed excitatory input, K, to each of the neurons.
Because K is excitatory, it will tend to drive each neuron's
output towards 1. 

Like a pencil balanced on its point, an n-flop also has 
an unstable equilibrium state. In this state all of the neurons 
are balanced somewhere between 0 and 1. Starting from 
this unstable equilibrium state, an n-flop composed of 
physical neurons would eventually become unbalanced 
because of random thermal noise. During this process one 
neuron would begin to predominate (Fig. 7). 

This action would force the other neurons towards zero. 
This in turn would lessen their inhibitory effect on the 
predominant neuron, allowing the excitatory energy of the 
K input to drive its output more towards 1. 

Simulations with n-flops have shown that the lower 
bound on G for a 4-flop is about 4. For a 20-flop it is about 
5 or 6, depending upon K. 

Acceptable values of K for a 4-flop range from about 0.75 
to 1.25. For a 20-flop, the range is roughly 1.25 to 1.75.




Too much energy from the K input will allow two or more 
neurons to predominate above the others, producing an 
m-of-n-flop. 
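
The settling behavior of an n-flop can be explored with a small simulation. The article does not give the equations of motion, so the sketch below assumes a simple first-order model: each neuron has an internal state u that relaxes toward its summed input (K minus the outputs of the other n - 1 neurons), and its output is the sigmoid of G times u. The function name, the Euler step size, and the noise amplitude are all illustrative assumptions.

```python
import math
import random

def sigmoid(z):
    """Logistic function written to avoid overflow."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def simulate_n_flop(n, gain, k, steps=4000, dt=0.05, noise=1e-3, seed=0):
    """Euler integration of an assumed n-flop model; returns the outputs."""
    rng = random.Random(seed)
    # Start all neurons almost balanced; tiny random offsets stand in
    # for the thermal noise that unbalances a physical n-flop.
    u = [rng.uniform(-noise, noise) for _ in range(n)]
    for _ in range(steps):
        v = [sigmoid(gain * ui) for ui in u]
        total = sum(v)
        # Summed input: fixed excitation K minus lateral inhibition
        # from the other n - 1 neurons.
        x = [k - (total - vi) for vi in v]
        u = [ui + dt * (xi - ui) for ui, xi in zip(u, x)]
    return [sigmoid(gain * ui) for ui in u]

outputs = simulate_n_flop(n=4, gain=8.0, k=1.0)
print([round(o, 2) for o in outputs])
```

Under these assumptions, a 4-flop with G = 8 and K = 1 ends with one neuron near 1 and the rest well below one half; which neuron wins depends on the random seed.
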

The above values are for n-flops that rely solely on lateral 
inhibition to produce the 1-of-n final state. There is an 
additional constraint that could be applied to ensure that 
the summation of all the outputs will be close to some 
value: 1 would be a good choice for the n-flop. Hopfield 
called this constraint global inhibition. The global inhibition
constraint is added to the n-flop by first summing all
of the outputs of the n neurons. From this sum the desired
total value is subtracted. The final value is then sent to an
inhibitory input of each neuron in the n-flop. For example,
if all of the neurons in an n-flop were initialized to 0, then
a value of -1 would be applied to an inhibitory input of
each neuron. Applying -1 to an inhibitory input is the
same as applying +1 to an excitatory input. The effect is
to drive the outputs up towards 1 so that the lateral inhi- 
bition factor can then take over to ensure a 1-of-n state. 
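
The arithmetic of the global inhibition term is small enough to show directly. The helper names below are hypothetical; the sketch only computes the signal described in the text and does not model how strongly it is weighted at each neuron.

```python
def global_inhibition_signal(outputs, target_total=1.0):
    """Value sent to an inhibitory input of every neuron in the n-flop:
    the sum of all outputs minus the desired total."""
    return sum(outputs) - target_total

def effective_excitation(inhibitory_value):
    """An inhibitory input inverts its signal, so a negative value
    acts as an excitation of the same magnitude."""
    return -inhibitory_value

# All four neurons initialized to 0: the signal is -1, which acts as
# an excitation of +1 and drives the outputs up.
signal = global_inhibition_signal([0.0, 0.0, 0.0, 0.0])
print(signal, effective_excitation(signal))

# A clean 1-of-n state already sums to the target, so the term vanishes.
print(global_inhibition_signal([1.0, 0.0, 0.0, 0.0]))
```
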

The global inhibition mechanism seems to be somewhat 
analogous to an automatic gain control. When in place, it 
allows the n-flop to operate over a wider variation of param- 
eters (Fig. 8), thereby avoiding the m-of-n-flop problem. 

It's easy to connect neurons into a 1-of-n structure and 
then make the aggregate structure work reliably. We might 
even say that this represents a natural form for neurons. 
Neurons connected in this manner represent a more or- 
dered condition. Their stable equilibrium state represents 



Fig. 6. The heavy line connecting the four neurons represents
their mutual inhibitory connections. A circle containing the
number of states presents a more compact, abstract view.

Fig. 7. Snapshots in time of a 4-flop proceeding from an
initial unstable equilibrium state to a final stable equilibrium
state. An n-flop with no external biasing inputs will achieve a
random final state.

Fig. 8. Parameter sensitivity studies
for a 4-flop and an 8-flop. The
two charts on the left show the
operational region with lateral inhibition
only. On the right, both lateral
and global inhibition are used to
ensure the n-flop's 1-of-n characteristic.
The outer dark circle represents
the relative strength of the
predominant neuron. The inner
white circle represents the average
strength of the nonpredominant
neurons. Dashes represent
combinations of G and K that did
not result in a valid 1-of-n state.



the resolution of the constraint: "Pick only one of these 
n." This 1-of-n condition can be thought of as a data struc- 
ture — a means to represent the answer to an optimization 
problem. For example, in the four-color map problem, a 
syntactically correct answer is that each country is colored 
only one of four colors. A qualitatively correct answer re- 
quires additionally that each country be a different color 
from each of its neighbors. 

NxN-Flops 

Beyond n-flops there is an even higher degree of order 
that can be achieved. By taking $n^2$ neurons and arranging
them into a matrix of n rows and n columns, a new, more 
highly constrained data structure can be formed. We could 
call this an n-by-n-flop (Fig. 9). 

By applying the n-flop connection scheme to both the 
rows and the columns of neurons, we obtain a structure 
that will allow only one neuron to be active in each row 
and only one in each column. This type of data structure 
has n! states (Fig. 10). 

The number of connections increases as $n^3$. Each neuron
in the matrix is connected to the inhibitory inputs of the
other $n - 1$ neurons in its column and similarly to the $n - 1$
other neurons of its row. Thus, for this type of network
there are $2(n - 1)n^2$ or $2(n^3 - n^2)$ connections. For a 32 x 32-
flop there are 63,488 connections.
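
The connection count can be checked by building the wiring explicitly. The sketch below, with the hypothetical function name `n_by_n_connections`, enumerates every directed inhibitory connection in an n-by-n-flop and compares the total with the closed-form expression; it also prints n! for the number of stable states.

```python
from math import factorial

def n_by_n_connections(n):
    """Set of directed inhibitory connections ((row, col), (row, col)).

    Each neuron feeds the inhibitory inputs of the other n - 1 neurons
    in its row and the other n - 1 neurons in its column.
    """
    connections = set()
    for r in range(n):
        for c in range(n):
            for c2 in range(n):
                if c2 != c:
                    connections.add(((r, c), (r, c2)))
            for r2 in range(n):
                if r2 != r:
                    connections.add(((r, c), (r2, c)))
    return connections

n = 32
print(len(n_by_n_connections(n)), 2 * (n - 1) * n**2)  # both counts agree
print(factorial(8))  # number of stable states of an 8-by-8-flop
```
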

Programming with Neurons 

Problems best suited for networks of Hopfield neurons 
are those of optimization and constraint resolution. Addi- 
tionally, the solution to the problem should be easily rep- 
resented by a data structure that is compatible with neural 
structures. 

The first step is to construct an array that represents a 
syntactically correct solution to the problem. Often the 
same underlying structure will be found to be common to 
a number of problems. For example, the n-by-n-flop data 
structure is used to represent a syntactically correct answer 
to the traveling salesman problem. 

The next step is to provide biases to the individual 
neurons that represent the characteristics of the problem. 
The biases can be either constants or functions of other 
neurons and serve either as inhibitory or excitatory stimuli. 

The solution realization can be thought of as a dynamic

Fig. 9. The lines between neurons represent n-flop inhibitory
wiring. Only one neuron can be active in each row and one
in each column.

system achieving its lowest energy state. For example, a 
2-flop might be visualized as a ball balanced on a hill 
between two valleys. When released from its unstable 
equilibrium state and subjected to small random perturba- 
tions it will settle in either valley with a 50 percent prob- 
ability. Applying a bias will alter the probability of achiev- 
ing a given final state. 

The Eight Queens Problem 

The eight queens problem is one of simply satisfying 
constraints. The queen is the most powerful chess piece, 
being able to move the entire length of any row, column, 
or diagonal that it might occupy. The challenge is to place 
eight queens on a standard eight-by-eight chessboard such
that no queen is attacking any other queen, that is, no two 
queens occupy the same row, column, or diagonal (Fig. 11). 

To solve this problem with artificial neurons, we might 
consider the neural data structures defined earlier. The 
n-by-n-flop has the row/column exclusivity that is needed 
by the problem. An 8-by-8-flop by itself has solved two of 
the three constraints of the problem. All that remains is to 
provide diagonal inhibitory interconnections to each

Fig. 11. The eight queens problem: place the eight queens
on the board so that none is under threat of attack from any
other.

neuron in the 8-by-8 array much the same as the row/col- 
umn inhibitory connections. So now, each of the 64 
neurons (each representing a position on the chessboard) 
has the constraints specified in the problem applied to its 
inputs (Fig. 12). 
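
The inhibitory wiring for the eight queens network can be enumerated square by square. The sketch below is an illustration of the connection pattern only, not a simulation of the network; the function name `queens_inhibitors` and the sample placement are assumptions introduced here.

```python
def queens_inhibitors(row, col, n=8):
    """Squares whose neurons feed an inhibitory input of (row, col):
    every other square on the same row, column, or diagonal."""
    sources = set()
    for r in range(n):
        for c in range(n):
            if (r, c) == (row, col):
                continue
            same_line = r == row or c == col
            same_diagonal = abs(r - row) == abs(c - col)
            if same_line or same_diagonal:
                sources.add((r, c))
    return sources

# Neurons near the center receive more inhibitory inputs than those
# on the edge of the board.
print(len(queens_inhibitors(3, 3)), len(queens_inhibitors(0, 0)))

# A placement is a valid answer when no chosen square inhibits another.
# Column of the queen in each row (one known solution, used as a check).
placement = [0, 4, 7, 5, 2, 6, 1, 3]
squares = set(enumerate(placement))
print(all(not (queens_inhibitors(r, c) & squares) for r, c in squares))
```

The two counts printed first show that the neurons are not identically wired: a center neuron has 27 inhibitory inputs and an edge neuron has 21.
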

A solution is obtained by initializing all neurons to some 
near-equilibrium (but unstable, "high-energy") state and 
then releasing the network to seek a stable, "low-energy" 
state (Fig. 13). In practice, the network will sometimes 
become mired in a local minimum and only be able to 
place seven of the eight queens, much the same as its biolog- 
ical counterparts. This susceptibility to local minima may 
stem from the fact that all of the neurons are no longer 
identically programmed. The neurons in the center of the 
board are fed with 27 inhibitory inputs where those on the 
edges are fed with only 21. Perhaps lessening the gain of 
the center neurons in proportion to their additional inputs 
would tend to "flatten out" the local minima on the solu- 
tion surface. Another strategy to avoid local minima is to 




Fig. 10. Two of the 8! possible states for an 8-by-8-flop.

Fig. 12. The row and column constraints are managed by
the underlying n-by-n-flop data structure (a). Adding diagonal
inhibition completes the specification of constraints for the
eight queens problem (b).

Fig. 13. One of 92 possible solutions to the eight queens
problem. The neurons that have outputs close to 1 are shown
as shaded circles. These represent queen placements.

increase the gain of all of the neurons slowly from a low
value to a point that is adequate to allow a solution to
emerge. This technique has been used by Hopfield to 
achieve superior solutions to the traveling salesman prob- 
lem. Starting at a low gain seems to allow a complex net- 
work to initialize itself close to its unstable equilibrium 
point. 

Egon E. Loebner of Hewlett-Packard Laboratories has 
pointed out that there is often a "knight's jump" relation- 
ship between queen positions in solutions to the eight 
queens problem. Egon has suggested that each neuron in 
the 8-by-8 array be allowed to apply an excitatory stimulus
to its knight's-jump neighborhood. So, instead of being 
totally specified by constraints, there would be a mixture 
of both inhibitory constraints and excitatory stimuli. 

The Four-Color Map Problem 

In the eight queens problem it was possible to construct 
an analog of a chessboard with neurons — one neuron per 
board position. With a problem such as the four-color map 
problem, a similar (but perhaps not so obvious) analog will 
be used. 




Fig. 14. On the left is a generic tourist map. On the right is
the equivalent graph. The circles (vertices) represent the
countries and the lines connecting them (edges) represent
the adjacencies.

Fig. 15. The four-color map problem as a network of 4-flops.
The tiny circles represent inhibitory inputs. The lines between
nodes represent eight connections (four going each way).

The four-color map theorem states that at most four colors
are required to color any map that can be drawn on a flat
surface or the surface of a sphere.¹ There are some special
cases. A map drawn on a flat surface with straight lines
that begin and end at edges will require only two colors.
A map that is drawn on a Mobius strip may require up to
six colors. One drawn on a torus may require up to seven.

The first step is to transform the problem from the form 
so familiar to all of us — an outline map — to a form more 
familiar to mathematicians — a graph (Fig. 14). In this case, 
the graph does not take the form of the stock market's daily 
highs and lows but resembles more a subway map. In our 
graph, the nodes will represent the countries and the lines 
connecting the nodes will represent common borders be- 
tween countries. Mathematicians refer to the nodes as ver- 
tices and the connecting lines as edges. 

The next step is to place at each node (representing each 
country) a 4-flop that will indicate which one of four colors
the country is to be (Fig. 15). The 4-flop satisfies the first 
constraint of the problem: use only one of four colors for 
each node/country. 

The second constraint states that adjoining countries be 




Fig. 16. Mutual inhibitory connections between two 4-flops
produce a network in which each 4-flop will have a different
state. At the right is the high-level notation.

Fig. 17. The traveling salesman problem: link all of the cities
in the shortest possible distance.

of different colors; i.e., connected nodes must be in different
states. In the case of two 4-flops connected together,
this mutual exclusivity can be obtained by connecting the
output of each neuron to an inhibitory input on its counterpart
in the other 4-flop. Thus each line on the graph actually
represents eight inhibitory connections, four going each 
way (Fig. 16). 
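
The wiring between 4-flops follows mechanically from the graph. The sketch below generates the inhibitory connections for a list of borders and checks a candidate coloring; the country names, the function names, and the sample map are all hypothetical, and the internal lateral inhibition of each 4-flop is omitted.

```python
def four_color_connections(edges, n_colors=4):
    """Directed inhibitory connections between 4-flops.

    A neuron is identified as (country, color). For each border, the
    neuron for a color in one country inhibits the neuron for the same
    color in the other country, in both directions.
    """
    connections = []
    for a, b in edges:
        for color in range(n_colors):
            connections.append(((a, color), (b, color)))
            connections.append(((b, color), (a, color)))
    return connections

def is_proper_coloring(edges, coloring):
    """True when no two adjoining countries share a color."""
    return all(coloring[a] != coloring[b] for a, b in edges)

# A made-up map: three mutually adjacent countries plus one neighbor.
borders = [("A", "B"), ("B", "C"), ("A", "C"), ("C", "D")]
print(len(four_color_connections(borders)))  # eight per border
print(is_proper_coloring(borders, {"A": 0, "B": 1, "C": 2, "D": 0}))
```
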

The solution is obtained as in the eight queens problem. 
Place all of the neurons at some intermediate level and

Fig. 18. A given path is denoted by a list that indicates the
order of visits. The list CAEBD is interpreted as starting at
C, first visiting A, then E, B, and D, and then returning to C. Every
path belongs to a set of 2n equivalent paths, where n is the
number of cities on the path.

Fig. 19. The state of the n-by-n-flop (shown by the shaded
circles) indicates the path CAEBD. The cities are represented
by the rows of neurons and the order of visits is indicated by
the columns. The n-by-n-flop serves as the base data structure
for many optimization problems. Its characteristic resting
state allows only n neurons to be on, one per row and one
per column.

then let them go. 

The Traveling Salesman Problem 

The traveling salesman problem is perhaps the most fa- 
mous of all optimization problems. The problem provides 
us with n cities, each at some distance from the others (Fig. 
17). The objective is to visit each of the n cities once (and 
only once) and then return to the starting city while at the 
same time minimizing the total distance traveled. 

The traveling salesman problem is classified by mathematicians 
as being NP-complete, which in practice means that the time 
required to find the optimal path on a conventional computer 
by any known exact method grows exponentially as the number 
of cities increases. The number of solutions is n! (similar to 
the n-by-n-flop). The number of distinct solutions is only 
somewhat less. Each distinct path (Fig. 18) has n versions 
based upon which city is the origin. Another factor of two 
stems from the freedom to start out in either direction from 
the city of origin. Thus there are n!/2n distinct closed paths 
for the traveling salesman problem. 

For a dozen cities there are 12!/24 = 19,958,400 distinct 
paths. A modern personal computer could probably search 
them all during a coffee break. However, a problem that is 
not even three times larger (32 cities) contains 32!/64 ≈ 
4.11 × 10^33 paths. A multiprocessing supercomputer searching 
a billion paths a second would require more than 10^17 
years for a complete search. 
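
The arithmetic above is easy to reproduce. The following Python sketch simply evaluates n!/2n and the corresponding exhaustive search time; the function names and the default rate of a billion paths per second are illustrative choices, not part of the original article.

```python
import math

SECONDS_PER_YEAR = 365.25 * 24 * 3600


def distinct_tours(n):
    """Number of distinct closed tours through n cities: n!/(2n)."""
    return math.factorial(n) // (2 * n)


def years_to_search(n, paths_per_second=1e9):
    """Years needed to examine every distinct tour at the given rate."""
    return distinct_tours(n) / paths_per_second / SECONDS_PER_YEAR


print(distinct_tours(12))               # 19958400
print(f"{distinct_tours(32):.2e}")      # number of distinct 32-city tours
print(f"{years_to_search(32):.1e}")     # years at a billion paths per second
```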

Again, the first step is to find a representation for the 
problem solution that is compatible with a network of 
neurons, probably a graph or a matrix. As it turns out, the 
traveling salesman problem can be neatly represented with 
an n-by-n matrix of neurons (Fig. 19). The rows represent 
the n cities and the columns (labeled 1 through n) represent 
the stops along the path. The characteristics of the 
n-by-n-flop (only one active neuron per row and one per column) 
ensure a syntactically correct solution: all cities visited 
(one per column) and each city visited only once (one per 
row). 

Fig. 20. Distances between city B and every other city, 
normalized to city pair AB, the most widely separated city pair 
on the tour. 

Before proceeding, a path-length minimization heuristic 
must be selected. A fairly intuitive choice is: at any 
given city, proceed to the closest city not yet visited. 
Applying this heuristic sequentially can lead to results far 
from optimum. However, applying it to all neurons 
simultaneously tends to give results close to optimum. 

To program a particular version of an n-city traveling 
salesman problem, we need to know the distance between 
every city and every other city. These distances will be 
used to form an additional set of inhibitory constraints. 
Let the longest distance between any two cities be consid- 
ered 1.0, then normalize all of the other distances with 
respect to it (Fig. 20). Thus, the most widely separated city 
pair has a mutual inhibition factor of 1.0, and other city 
pairs have weaker mutual inhibitions proportional to their 
closeness. Each neuron is then connected to the n - 1 
neurons in the column to its left and to the n - 1 neurons 
in the column to its right. The strength of the inhibit signal 
is modulated by the mutual inhibition factor for the particu- 
lar city pair (Fig. 21). The extreme left and right columns 
are considered to be adjacent. 
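
As a rough illustration of this programming step, the Python sketch below normalizes all pairwise distances to the longest one and finds the two columns adjacent to a given stop, treating the extreme columns as neighbors. The city coordinates and the function names are hypothetical; only the normalization rule and the wraparound come from the text.

```python
import math


def mutual_inhibition(coords):
    """Map each ordered city pair to its distance divided by the longest distance."""
    names = sorted(coords)
    dist = {(a, b): math.dist(coords[a], coords[b])
            for a in names for b in names if a != b}
    longest = max(dist.values())
    return {pair: d / longest for pair, d in dist.items()}


def adjacent_columns(col, n):
    """Columns to the left and right of col; the outer columns wrap around."""
    return (col - 1) % n, (col + 1) % n


# Hypothetical layout in which A and B are the most widely separated pair.
cities = {"A": (0, 0), "B": (6, 8), "C": (3, 0), "D": (0, 4)}
weights = mutual_inhibition(cities)
print(weights[("A", "B")], weights[("A", "C")])
print(adjacent_columns(0, len(cities)))
```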

Let us take the viewpoint of a single neuron (first recalling 
that the columns represent the order of the path and 
that the rows represent the cities). We are sitting in one of 
the columns and we see a column of neurons to the left, 
each representing a city that's a possible stop before coming 
to us. A similar column lies to the right. We can think of 
one side as the "coming-from" column and the other side 
as the "going-to" column, although the direction that we 
perceive as going or coming is not really significant. Now, 
let's assume that we are the strongest neuron in both our 
row and our column. In addition to sending out row and 
column inhibits (to ensure a syntactically valid answer), 
we will send out inhibits of varying strengths to both the 
coming-from and going-to columns adjacent to us. The 
strongest inhibits go to those cities farthest from us, and 
the weakest inhibits go to the closest cities. The strength 
of the inhibit is simply the normalized distance fraction 
times the output of our neuron. 

Fig. 21. The n-by-n-flop provides a syntactically correct 
solution. Adding inputs to each neuron in the form of graded 
mutual inhibits between cities serves to improve the quality 
of the answer. Shown are additional inputs (in effect, the 
programming inputs) for a single neuron, S3. 
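
Seen from the receiving side, each neuron collects these graded inhibits from the n - 1 other cities in the two adjacent columns. The Python sketch below sums them for one neuron; the three-city weights and the fully settled output values are made-up numbers used only to show the bookkeeping.

```python
def distance_inhibit(outputs, weight, city, stop):
    """Total distance-based inhibit arriving at the neuron for (city, stop).

    outputs[c][s] is the output of the neuron for city c at stop s, and
    weight[frozenset((a, b))] is the normalized distance between a and b.
    """
    n = len(outputs[city])
    left, right = (stop - 1) % n, (stop + 1) % n
    total = 0.0
    for other, row in outputs.items():
        if other == city:
            continue  # same-row inhibition belongs to the n-by-n-flop itself
        total += weight[frozenset((city, other))] * (row[left] + row[right])
    return total


# Hypothetical normalized distances and a settled state for the tour A, B, C.
weight = {frozenset("AB"): 1.0, frozenset("AC"): 0.5, frozenset("BC"): 0.8}
outputs = {"A": [1.0, 0.0, 0.0], "B": [0.0, 1.0, 0.0], "C": [0.0, 0.0, 1.0]}

on_neurons = [("A", 0), ("B", 1), ("C", 2)]
total = sum(distance_inhibit(outputs, weight, c, s) for c, s in on_neurons)
print(distance_inhibit(outputs, weight, "A", 0), total)
```

In a settled state, the inhibit summed over the active neurons counts every leg of the tour twice, so a shorter tour leaves less residual inhibition in the network.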

Again, as in the eight queens problem, a solution is obtained 
by forcing all of the neurons to some arbitrary initial 
value. This represents an unstable condition, one that the 
preprogrammed constraints inherent in the n-by-n-flop will 
attempt to rectify. As the neurons collectively "fall away" 
from their initial values towards values that will satisfy 
the constraints of the n-by-n-flop data structure, the biasing 
network (representing the distances between all of the 
cities) will attempt to "pull" the network towards a final 
state that has the lowest overall residual error. The solution 
will generally not be a perfect one but it will probably be 
a good one. 

Fig. 22. An example of the graph sectioning problem using 
a simple ring network. For n = 2, the problem is to bisect 
the network such that each partition has an equal number of 
nodes and a minimum number of line crossings. 

The Graph Sectioning Problem 

At first, the graph sectioning problem sounds somewhat 
contrived: take an arbitrary graph and partition the 
nodes into n sections. The constraints are that each section 
should have an equal (or near equal) number of nodes. 
Additionally, the nodes should be assigned to the sections 
in such a way as to minimize the node connection lines 
crossing between sections (Fig. 22). 
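
The two quantities being traded off can be stated in a few lines of Python. This sketch scores a candidate partition of an eight-node ring like the one in Fig. 22; the function names and the particular assignments are illustrative assumptions.

```python
from collections import Counter


def crossings(edges, section_of):
    """Number of edges whose two end nodes lie in different sections."""
    return sum(1 for a, b in edges if section_of[a] != section_of[b])


def imbalance(section_of, n_sections):
    """Difference between the largest and smallest section sizes."""
    counts = Counter(section_of.values())
    sizes = [counts.get(s, 0) for s in range(n_sections)]
    return max(sizes) - min(sizes)


# A simple ring of eight nodes.
ring = [(i, (i + 1) % 8) for i in range(8)]

contiguous = {i: 0 if i < 4 else 1 for i in range(8)}  # two arcs of four nodes
alternating = {i: i % 2 for i in range(8)}             # every other node

print(crossings(ring, contiguous), imbalance(contiguous, 2))
print(crossings(ring, alternating), imbalance(alternating, 2))
```

Both assignments are perfectly balanced, but the contiguous one cuts only two lines while the alternating one cuts all eight.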

Consider the logic circuitry of a digital computer. The 
individual logic circuits and their interconnections can be 
abstracted as a huge graph. The logic circuits form the 
nodes (or more properly, the vertices) and the input and 
output connections form the graph edges. A typical com- 
puter logic circuit might have three or four inputs and its 
output might be connected to half a dozen or more other 
logic circuits. 

Currently, research is being performed on automatically 
synthesizing the logic circuitry of an entire computer. The 
problem then becomes how to divide 100,000 logic circuits 
into chips, each of which has a finite area for logic circuitry 
and a limited number of input/output pins to accommodate 
connections to other chips. 

Like the four-color map problem, the graph sectioning 
problem will require an n-flop at each node of the graph. 
The number of sections is represented by n. It is interesting 
to note how readily the problem can be set up. Simply 
replace each logic circuit by an n-flop and use the existing 
interconnections to connect the n-flops. This connection 
will be an excitatory connection. Once a node is assigned 
a section, assigning all connected nodes to the same section 
will tend to minimize the connections crossing between 
sections. 

Fig. 23. Excitatory connections try to keep adjacent nodes 
in the same section. Global balance constraints keep all 
nodes from being put into the same section. 

To ensure even distribution among the sections, a global 
constraint term is required. As the solution emerges, the 
sum of nodes assigned to each section is monitored. If an 
imbalance occurs, an excitatory bias is applied to encourage 
assignment to the deficient sections. At the same time an 
inhibitory bias is applied to discourage assignment to a 
section that has a surplus (Fig. 23). 
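
One simple way to picture the global constraint term is as a bias proportional to each section's shortfall or surplus. The Python sketch below is an assumption about how such a term might be computed; the linear form and the gain parameter are not specified in the article.

```python
def balance_bias(section_counts, gain=1.0):
    """Bias per section: positive (excitatory) if deficient, negative if in surplus."""
    target = sum(section_counts) / len(section_counts)
    return [gain * (target - count) for count in section_counts]


# Eight nodes, two sections: six nodes have drifted into section 0.
print(balance_bias([6, 2]))
```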

The solution is obtained much the same as in the other 
problems. First the neurons are placed at an arbitrary initial 
value and then the network is released from its initial con- 
strained state. Immediately, the underlying data structure 
constraints will seek a global compromise with the biasing 
(i.e., programming) constraints. The result will generally 
be a good solution but probably not a perfect one. 

Why can't perfect solutions be readily obtained? Con- 
sider the following rationalization. A structure such as an 
n-flop is a very stable entity when it is in its final equilib- 
rium state; the energy required to change its state is consid- 
erable. On the other hand, just after the network is released 
from its initial state, it is very susceptible to the effects of 
the biasing network. This is because the effect of the data 
structure constraints does not become evident until an im- 
balance begins to occur. 

At the point of initial release the output of the biasing 
network is at its strongest. Its effect will be to "steer" the 
network towards a state that will produce a lesser error. 
As the network nears a state that tends to satisfy the biasing 
constraints, the total energy available from the biasing net- 
work will diminish. At some point, the effect of the under- 
lying data structure will begin to predominate and sub- 
sequently pull the network to its final equilibrium state. 
So, at the final equilibrium state there will probably still 
be some small correctional signal from the biasing network 
(representing a less than perfect solution) but its strength 
will be small compared to that required to shift the network 
from its resting state. 

Summary 

Simple networks of nonlinear summing devices have 
demonstrated the collective property of being able to re- 
solve elementary constraints. By viewing these networks 
as neural data structures, more complex networks can be 
easily conceptualized. These highly interconnected net- 
works are able to find near-optimal solutions to difficult 
NP-complete optimization problems such as the traveling 
salesman problem. 

The method of matching a solution network to a problem 
is twofold. First, a network must be realized that yields a 
syntactically correct answer. Then additional constraints 
or programming biases relating to the problem are added 
to the network. These additional inputs serve to select qual- 
itatively good answers from the set of syntactically correct 
ones. 



JUNE 1989 HEWLETT-PACKARD JOURNAL 77 



© Copr. 1949-1998 Hewlett-Packard Co. 



Reference 

1. K. Appel and W. Haken, "The Solution of the Four-Color-Map 
Problem," Scientific American, Vol. 237, October 1977, 
pp. 108-121. 

Bibliography 

1. J.J. Hopfield and D.W. Tank, "'Neural' Computation of 
Decisions in Optimization Problems," Biological Cybernetics, 
Vol. 52, 1985, pp. 141-152. 

2. J.J. Hopfield and D.W. Tank, "Computing with Neural Circuits: 
A Model," Science, Vol. 233, no. 4764, August 8, 1986, 
pp. 625-633. 

3. D.W. Tank and J.J. Hopfield, "Collective Computation in 
Neuronlike Circuits," Scientific American, Vol. 257, no. 6, 
December 1987, pp. 104-114. 

4. C. Peterson and J.R. Anderson, Neural Networks and 
NP-Complete Optimization Problems: A Performance Study on the 
Graph Bisection Problem, MCC Technical Report No. EI-287-87, 
December 14, 1987. 



A New 2D Simulation Model of 
Electromigration 

Electromigration in miniature IC interconnect lines is 
simulated in HP's sophisticated two-dimensional model, 
giving new quantitative and graphical insights into one of 
the most important metallization failure sources for VLSI 
chips. 

by Paul J. Marcoux, Paul P. Merchant, Vladimir Naroditsky, and Wulf D. Rehder 



WHEN THIN METAL FILMS, such as the intercon- 
nect lines of integrated circuits, are stressed by 
high current densities, a slow migration of atoms 
is induced, which for aluminum and its alloys proceeds 
in the direction from cathode to anode. It can be inferred 
from transmission electron microscopy (TEM), scanning 
electron micrographs (SEM), and diffusion studies that 
these atoms travel predominantly along grain boundaries. 
If structural inhomogeneities develop in the conductor, for 
example at material interfaces or at triple points or at grain 
size divergences, then the current-induced atom flow is 
nonuniform. As a consequence, there exists a nonzero di- 
vergence between the incoming and the outgoing flux at 
these locations, so that material piles up and vacancies 
form. While such accumulation of material (sometimes 
called hillocks or, in special cases, whiskers) may short 
adjacent conductors, the vacancies, on the other hand, will 
deteriorate the line until voids have coalesced sufficiently 
to form cracks that eventually lead to electrical opens. 

Over the past 20 years, this complex phenomenon, 
known as electromigration, has become a subject of increas- 
ing concern for the entire chip industry because of its del- 
eterious effect on IC reliability. It is especially troublesome 
now, in light of the continuing shrinkage of IC dimensions 
below the one-micrometer level. 

Hundreds of papers have been written about electromi- 
gration and special task forces have been established in- 
volving the main chip makers worldwide, but a detailed 
theoretical understanding of this phenomenon is still in 
its early stages. Two main approaches have evolved. The 
first and earlier method saw researchers test their partial 
theories by deriving analytical formulas from plausible 
physical assumptions, with computational results that can 
be tested against the substantial body of empirical data. 
Two of the most prominent analytic expressions are 
Huntington's formula¹ for the atomic flux (see equation 7 below), 
and Black's semiempirical equation (see equation 8 below) 
for the median time to failure.² 

The second approach starts from basic physical princi- 
ples and established general laws such as the continuity 
equation and diffusion laws, and uses these to drive simu- 
lation models built to mimic the dynamic sequence of 
events in an interconnect line through a time-dependent 
Monte Carlo method. Early simulation models are those of 



Attardo³ and Nikawa.⁴ 

This paper gives an outline of a new 2D simulation model 
for electromigration, which is the result of a four-year 
collaboration of HP's Integrated Circuits Laboratory and the 
Center for Applied Mathematics and Computer Science at 
California State University at San Jose. 

Classical Methods Used 

Some quantum mechanical approaches purport to be able 
to treat field and current effects and the so-called direct 
and wind forces in a self-consistent manner. However, the 
practical gain of these more intricate models over classical 
models appears limited, for two reasons. First, the quantum 
theoretical formulas often coincide, at least for the special 
and manageable cases of interest, with the classical 
expressions (see, for example, equation 25 in reference 5). Second, 
there are still fundamental conceptual difficulties as to the 
nature of these underlying forces at the sites of impurities 
(see the controversy reported in reference 6 and the recent 
paper by Verbruggen⁷). Faced with these theoretical 
difficulties, but also encouraged by the considerable data 
validation of Huntington's classical formula and restrained 
by considerations of simplicity and computer adaptability 
(codeability), our team took a pragmatic approach and kept 
all model assumptions, all mathematical techniques, and 
all underlying physical principles entirely classical, 
neglecting quantum effects. This limitation led us to the 
adoption of the following model hypotheses about the two forces 
that are ultimately responsible for electromigration in 
metals such as aluminum. 

Fig. 1. Flowchart of the basic algorithm for the simulation 
program. 

The electrostatic direct force caused by the applied exter- 
nal field acts on the metal ions. The wind force associated 
with the electron current that accompanies the electric 
field, which is sometimes called electron drag or electron 
friction, is a consequence of the momentum transfer from 
the electrons to the ions during collisions. In aluminum 
alloys, this latter scattering term dominates the electrostatic 
force. Hence the resulting migration is towards the anode. 

A major feature of the new HP model is that only one 
generic type of equation, the potential equation, describes 
both the heat development (Helmholtz equation) and the 
current flow (Laplace equation). The potential equation is 
a two-dimensional partial differential equation, allowing 
continuous or discontinuous step functions as coefficients. 
A finite element method in two dimensions applied to this 
potential equation renders a large system of linear equa- 
tions with a sparse coefficient matrix for both cases. A 
Fortran subroutine was written to solve this system for 
temperature and current.⁸ 

By means of another crucial model building block, the 
continuity equation, we keep track of the material move- 
ment resulting from the inhomogeneities mentioned above. 
Material flow in polycrystalline films at temperatures 
below about two thirds of the melting temperature of the 
metal occurs predominantly through grain boundaries. 




Thus, we consider material flow only through the grain 
boundary network. 

Model Overview 

The flowchart of the algorithm is shown in Fig. 1. 
Grain Structure Generation. First, using a Monte Carlo 
technique, we generate a two-dimensional geometrical pattern 
that simulates the grain structure of a thin metal film 
by a plane dissection method called Voronoi tessellation. 
A Voronoi tessellation in two dimensions is a partition of 
the plane into cells bounded by polygons. Cells are defined 
by seeds, which are randomly distributed using a Poisson 
distribution. Each cell consists of the points that lie closer 
to its seed than to any other seed, so every edge of the 
bounding polygon is equidistant from the cell's seed and the 
seed of a neighboring cell. These cells then represent 
a carpet of metal grains from which we then clip our metal 
interconnect line (Fig. 2). 
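
A coarse way to experiment with this idea is to label each cell of a pixel grid with its nearest seed. The Python sketch below does that and tallies grain areas; it is a simplified stand-in, not the package's geometric construction, and it assumes a fixed number of uniformly placed seeds (which is how a Poisson point pattern is distributed once the seed count is known). All names and sizes are hypothetical.

```python
import random


def voronoi_grid(width, height, n_seeds, rng):
    """Label every grid cell with the index of its nearest seed."""
    seeds = [(rng.uniform(0, width), rng.uniform(0, height))
             for _ in range(n_seeds)]
    labels = []
    for y in range(height):
        row = []
        for x in range(width):
            cx, cy = x + 0.5, y + 0.5  # cell center
            row.append(min(range(n_seeds),
                           key=lambda s: (seeds[s][0] - cx) ** 2
                                         + (seeds[s][1] - cy) ** 2))
        labels.append(row)
    return seeds, labels


def grain_areas(labels, n_seeds):
    """Area of each grain, counted in grid cells."""
    areas = [0] * n_seeds
    for row in labels:
        for grain in row:
            areas[grain] += 1
    return areas


rng = random.Random(1)
seeds, labels = voronoi_grid(60, 20, 25, rng)
areas = grain_areas(labels, 25)

# "Clip" a narrow interconnect line, four cells wide, out of the grain carpet.
line = [row[:] for row in labels[8:12]]
print(sorted(areas, reverse=True)[:5], len(set(g for row in line for g in row)))
```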

Our package calculates the following statistical 
characteristics of the grain structure thus simulated: 
■ The area distribution of the grain sizes 
■ The diameter distribution of the grains 
■ The distribution of segment length 
■ The number of triple points 
■ The number of vertices. 

Fig. 2. Screen showing the interactive clipping process, in 
which individual lines that will be subjected to simulated stress 
are cut from a large area. This operation mimics the 
lithography and etching steps in the formation of real test 
structures. 

Fig. 3. (a) Distributions of grain areas (top) and average 
diameters (bottom). (b) Distributions of lengths of grain 
boundary segments (top) and the number of vertices per grain 
(bottom). 

Figures 3a and 3b show typical histograms for areas, 
diameters, segment length, and number of vertices for the 
grain structure shown in Fig. 2. These distributions are 
characteristic of real deposited films. Thus, our model is 
useful in studying the correlation between failure distribu- 
tions and deposited film grain structures. 
Current Flow. The main advantage of the HP model over 
others like Nikawa's⁴ is that both the current flow and the 
heat equation are truly two-dimensional (as is the genera- 
tion of the grain structure just described). The steady-state 
behavior of the current flow is described by the Laplace 
equation 

$$(k u_x)_x + (k u_y)_y = 0, \tag{1}$$

where simple discontinuities in the electrical conductivity 
k = k(x, y) are allowed. The value k = 0 is assigned to those 
cells in the finite element grid where no current is being 
conducted because of cracks. In electrically active cells, 
on the other hand, the function k assigns a constant value 
K > 0. 

Once the Fortran solver subroutine has solved this Laplace 
equation for the potential u = u(x, y), we obtain a discrete 
approximation for the current density j as follows. In 
the defining equation for j, 

$$j = -k \, \mathrm{grad}(u), \tag{2}$$

substitute for grad(u) the finite differences of the first order, 

$$u(i+1, j) - u(i, j) \quad \text{and} \quad u(i, j+1) - u(i, j). \tag{3}$$
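
The substitution can be sketched in Python as follows. This is an illustration on a unit-spaced grid with a prescribed (not solved) potential; the function name, the array layout u[y][x], and the sample values are assumptions, and the original package was written in Fortran.

```python
def current_density(u, k):
    """Forward-difference estimate of j = -k grad(u) on a unit-spaced grid.

    u[y][x] is the potential and k[y][x] the conductivity of each cell.
    Returns (jx, jy); each array is one entry shorter along its own axis.
    """
    ny, nx = len(u), len(u[0])
    jx = [[-k[y][x] * (u[y][x + 1] - u[y][x]) for x in range(nx - 1)]
          for y in range(ny)]
    jy = [[-k[y][x] * (u[y + 1][x] - u[y][x]) for x in range(nx)]
          for y in range(ny - 1)]
    return jx, jy


# Hypothetical 3-by-5 grid: the potential falls linearly from 1.0 to 0.0.
u = [[1.0 - 0.25 * x for x in range(5)] for _ in range(3)]
k = [[2.0] * 5 for _ in range(3)]
k[1][2] = 0.0  # a cracked cell conducts no current

jx, jy = current_density(u, k)
print(jx[0][0], jx[1][2])
```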

Temperature Distribution. Current flow in the intercon- 
nects leads to Joule heating, which is dissipated to the 
surrounding environment. In our model we consider the 
heat flow through a two-dimensional rectangle in the finite 
element grid and derive the following partial differential 
equation for the temperature function T = T(x, y) in the 
metal line: 

$$(\tau T_x)_x + (\tau T_y)_y + j^2 \rho_0 (1 + \alpha T) = 0. \tag{4}$$

Here $\tau = \tau(x, y)$ is the thermal conductivity coefficient, 
$\rho_0$ is the resistivity of the metal at a reference temperature 
$T = T_0$, $\alpha$ denotes the temperature coefficient of resistance, 
and j is the absolute value of the current density. Because 
the conducting line on an electromigration test chip ends 
on either side in relatively large bonding pads or at contacts 
to the substrate, which are at an ambient temperature 
$T = T_a$, the boundary conditions are also well-defined. 

It is clear that a numeric procedure to solve equation 4 
will also solve the more specialized Laplace equation 1, 
since formally putting $\rho_0 = 0$ transforms equation 4 into 
equation 1, where the temperature function T is replaced 
by the potential u, and the electrical conductivity k assumes 
the role of the thermal conductivity $\tau$. This observation 
makes it possible for one Fortran subroutine to solve both 
equations. 
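
The same idea can be shown with a small Python routine that handles both cases through one code path. Note the substitutions: this sketch uses a finite-difference grid with Gauss-Seidel relaxation and a uniform coefficient rather than the finite element method and sparse solver described above, and every name and number in it is hypothetical. The source term is written as s0 + s1*phi, so that s0 = j²ρ₀ and s1 = j²ρ₀α for the heat problem and s0 = s1 = 0 for the Laplace problem.

```python
def solve_potential(nx, ny, coeff, left, right, s0=0.0, s1=0.0, sweeps=2000):
    """Relax coeff * laplacian(phi) + s0 + s1 * phi = 0 on a unit-spaced grid.

    The left and right columns are held at fixed values; the top and bottom
    edges carry no flux. Requires coeff * (number of neighbors) > s1.
    """
    phi = [[0.0] * nx for _ in range(ny)]
    for row in phi:
        row[0], row[-1] = left, right
    for _ in range(sweeps):
        for y in range(ny):
            for x in range(1, nx - 1):
                nb = [phi[y][x - 1], phi[y][x + 1]]
                if y > 0:
                    nb.append(phi[y - 1][x])
                if y < ny - 1:
                    nb.append(phi[y + 1][x])
                phi[y][x] = (coeff * sum(nb) + s0) / (coeff * len(nb) - s1)
    return phi


# Current flow: no source term, 1.0 at the left pad and 0.0 at the right pad.
u = solve_potential(9, 3, coeff=1.0, left=1.0, right=0.0)

# Heating: uniform source, both ends held at the (zero) ambient temperature.
T = solve_potential(9, 3, coeff=2.0, left=0.0, right=0.0, s0=1.0)

print(round(u[1][4], 6), round(T[1][4], 6))
```

With no source the potential falls linearly between the pads, while a uniform source produces the familiar parabolic temperature rise peaking at mid-line.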

Atomic Flux. In addition to temperature gradients and current 
density changes, there are other parameters influencing 
the flux of atoms through the grain boundary network 
of the metal line. The two most important geometrical 
parameters are $\theta$ and $\psi$, where the angle $\theta$ denotes the 
misorientation of adjacent grains, and the angle $\psi$ denotes the 
angle between the grain boundary and the current flow 
vector. Their values define, together with the mobility 
constant A, the preexponential term $D_{0b}$: 

$$D_{0b} = A \sin(\theta/2) \cos(\psi) \tag{5}$$

for the grain boundary diffusion term 

$$D = D_{0b} \exp(-E_a / kT), \tag{6}$$

where $E_a$ is the activation energy and k is the Boltzmann 
constant. In our model, $\theta$ and $\psi$ are assigned to each grain 
boundary segment by a Monte Carlo process. The diffusion 
term D then enters the Huntington formula for the atomic 
flux $J_a$: 

$$J_a = D N_b \frac{Z_b^* e}{kT} \rho (j - j_c). \tag{7}$$

Here $\rho = \rho_0 (1 + \alpha (T - T_0))$ is the resistivity at 
temperature T, $Z_b^* e$ is the term for the effective charge, the 
material constant $N_b$ denotes the atom density (e.g., for 
aluminum) in grain boundaries, and $j_c$ is a stress-induced 
countercurrent density term which in our model (see also 
reference 4) depends on temperature. 

Fig. 4. Screen showing void formation in a test structure after 
current stressing. Cells containing a voided grain boundary are 
highlighted in blue-green. The grid size is larger than that 
normally used in simulations to make the voiding process 
visible. Electron flow is from left to right. 

Fig. 5. Plot of the failure time of a single line structure that 
has been subjected to simulated stresses at various current 
densities (horizontal axis: current density j, in 10^6 A/cm^2) 
under conditions of normal Joule heating (solid) and 
fixed temperature (broken). H is the film-to-substrate heat 
transfer coefficient. A large value of H ensures a constant 
temperature. The value of 0.75 nWtnm-°C is typical of real 
systems. The quantity r^2 is the regression coefficient and n 
is the slope from the least squares fit. 
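
To make the chain from grain boundary geometry to atomic flux concrete, here is a minimal Python sketch of the three relations: the geometric preexponential term, the thermally activated diffusion term, and the Huntington flux. It assumes the usual Arrhenius form with a negative exponent, expresses energies in electron volts, and treats the effective charge as a multiple of the elementary charge; the function names and all numerical inputs are placeholders rather than values from the model.

```python
import math

BOLTZMANN_EV = 8.617333262e-5  # Boltzmann constant in eV/K


def diffusivity(mobility, theta, psi, activation_ev, temp_k):
    """Grain boundary diffusion term: A sin(theta/2) cos(psi) exp(-Ea/kT)."""
    prefactor = mobility * math.sin(theta / 2) * math.cos(psi)
    return prefactor * math.exp(-activation_ev / (BOLTZMANN_EV * temp_k))


def resistivity(rho0, alpha, temp_k, ref_temp_k):
    """Resistivity at temp_k from its value rho0 at the reference temperature."""
    return rho0 * (1 + alpha * (temp_k - ref_temp_k))


def atomic_flux(diff, atom_density, eff_charge, temp_k, rho, j, j_c):
    """Huntington flux: D * N * (Z*e / kT) * rho * (j - j_c)."""
    return (diff * atom_density * eff_charge / (BOLTZMANN_EV * temp_k)
            * rho * (j - j_c))


# Placeholder inputs chosen only to exercise the formulas.
d_cool = diffusivity(1.0, math.radians(30), math.radians(20), 0.5, 400.0)
d_hot = diffusivity(1.0, math.radians(30), math.radians(20), 0.5, 500.0)
rho = resistivity(2.0, 0.004, 350.0, 300.0)
print(d_hot > d_cool, rho, atomic_flux(d_hot, 1.0, 1.0, 500.0, rho, 2.0, 2.0))
```

The flux vanishes when the current density equals the countercurrent term, and the diffusion term rises steeply with temperature, which is why Joule heating matters so much in the results that follow.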

Crack Growth and Line Failure. The program tracks the 
change in the number of atoms coming into and flowing 
out of each triple point. Cracks start growing at these triple 
points as soon as the atom count in the transporting grain 
boundaries shows a density of incoming atoms below a 
certain threshold (Fig. 4). Hence, if the material concentration 
drops below this level in a particular area, then this 
location ceases to be electrically conducting. In a similar 
vein, mass accumulation above a certain value creates hill- 
ocks. It happens occasionally in this model (as it does in 
reality!) that small cracks recover and fill in again, and 
some hillocks decrease or vanish completely. 

Normally, the program is run until the simulated line 
fails. Failure is determined by calculating the gradient of 
the potential along vertical grid lines that are superimposed 
over the metal film (and used for the discretization procedure 
necessary for the finite element method). If this gradient 
is zero for two adjacent grid lines we know that no 
current is flowing (which signifies an open circuit) and the 
metal conductor has failed. 
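
One possible reading of this failure test is sketched below in Python. It assumes the gradient in question is the potential difference across each gap between neighboring vertical grid lines, and it declares failure when two adjacent gaps show no difference anywhere along their height. The tolerance, the function name, and the sample potentials are illustrative assumptions.

```python
def line_failed(u, tol=1e-12):
    """True if two adjacent grid-line gaps show no potential difference.

    u[y][x] is the potential on the grid; x runs along the metal line.
    """
    ny, nx = len(u), len(u[0])

    def gap_is_flat(x):
        return all(abs(u[y][x + 1] - u[y][x]) <= tol for y in range(ny))

    flat = [gap_is_flat(x) for x in range(nx - 1)]
    return any(flat[x] and flat[x + 1] for x in range(nx - 2))


# Healthy line: the potential drops steadily, so current flows everywhere.
healthy = [[1.0, 0.75, 0.5, 0.25, 0.0] for _ in range(3)]

# Open line: the whole drop sits across one crack and no current flows.
opened = [[1.0, 1.0, 1.0, 0.0, 0.0] for _ in range(3)]

print(line_failed(healthy), line_failed(opened))
```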

Creating several independent metal stripes (by clipping 
them out of the simulated grain structure) and subjecting 
them sequentially to the same accelerated test conditions 
provides a sample of failure times whose statistical distri- 
bution can be plotted and studied. 

In the remainder of this paper we are concerned with 
determining how the new HP simulation package provides 
an accurate representation of the real-world situation of 
accelerated electromigration testing. There is a wealth of 
published experimental data available, and we can address 
here only a few of the more critical experiments. 

Time to Failure versus Current Density 

One of the early models, proposed on empirical grounds 
by Black² and later analytically derived (under certain 
conditions) by Chhabra and Ainslie,⁹ states a direct 
relationship between the median time to failure (MTTF) $t_{50}$, the 
current density j, and the absolute temperature T. A 
simplified version of this relationship is given by the 
following Arrhenius-type equation: 

$$t_{50} = B j^{-n} \exp(E_a / kT). \tag{8}$$

Here B is a proportionality constant and k denotes the 
Boltzmann constant. 

Fig. 6. Cumulative distribution of the logarithms of simulated 
failure times. Units on the abscissa are standard deviations 
from the median (0). A straight line represents a lognormal 
failure distribution. 

Much attention has been given in the literature to the 
exponent $-n$, which on a logarithmic plot of $\ln(t_{50})$ versus 
$\ln(j)$ can be interpreted as the slope of a straight line. With 
this model we can fix the thermal boundary conditions to 
hold the metal line temperature constant. A comparison 
of failure times of the same metal line structure at several 
current densities under conditions of fixed temperature 
and Joule heating is shown in Fig. 5. This result shows 
that Joule heating can account for the curvature of the solid 
line plot. Under the assumptions of our model there are 
no other sources of curvature. 

Notice that because of the high value of the current density 
j in VLSI devices, even small changes in the exponent 
n have a major impact on the MTTF $t_{50}$ if extrapolations 
are made from high (test) to lower (operating) current 
densities.¹⁰⁻¹³ Thus it is important to have an understanding 
of the origin of the behavior. 
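
The sensitivity to n is easy to see numerically. The Python sketch below evaluates Black's simplified relationship and the resulting ratio of operating lifetime to test lifetime; the function names and all parameter values are illustrative assumptions, with energies in electron volts.

```python
import math

BOLTZMANN_EV = 8.617333262e-5  # Boltzmann constant in eV/K


def median_time_to_failure(b, j, n, activation_ev, temp_k):
    """Black's simplified relationship: t50 = B * j**(-n) * exp(Ea / kT)."""
    return b * j ** (-n) * math.exp(activation_ev / (BOLTZMANN_EV * temp_k))


def acceleration_factor(j_test, j_use, n, activation_ev, t_test, t_use):
    """Ratio of the operating-condition MTTF to the test-condition MTTF."""
    thermal = math.exp(activation_ev / BOLTZMANN_EV * (1 / t_use - 1 / t_test))
    return (j_test / j_use) ** n * thermal


# Same temperature, tenfold reduction in current density, two candidate exponents.
for n in (1, 2):
    print(n, acceleration_factor(2e6, 2e5, n, 0.5, 500.0, 500.0))
```

With a tenfold reduction in current density at fixed temperature, moving from n = 1 to n = 2 changes the predicted lifetime gain from a factor of 10 to a factor of 100.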

Time to Failure versus Linewidth 

An interesting feature of the linewidth dependence of 
the failure time is the so-called bamboo effect, which may 
be described as follows: if the width of an interconnect 
line decreases from several (median) grain diameters down 
to about one grain diameter, the failure time decreases 
linearly. However, a further decrease in linewidth results 
in far longer MTTFs; that is, most lines survive longer. 

We have observed that the standard deviation of the fail- 
ure time follows the same trend, implying that very narrow 
aluminum lines, while living longer, may have the serious 
drawback of unpredictably large variations as measured by 
the standard deviation. As a consequence, the quality of a 
sample of such thin stripes would vary widely. 

In the framework of the structural model presented here, 
the bamboo effect can be explained rather easily. If the 
width drops below the size of a single typical metal grain, 
hardly any triple points remain, and the thin line looks 
like a sequence of grains stacked together similar to the 
segments of a bamboo stick. That some residual electromi- 
gration still occurs is usually attributed to surface diffusion, 
but data for this phenomenon in aluminum films is pres- 
ently lacking. 

Results from simulations further suggest that larger vari- 
ations in the grain size distribution trigger the bamboo 
effect somewhat earlier than for a more homogeneous size 
distribution. An exact relationship is not yet known. 

Distribution of the Failure Time 

When talking about failure times of thin metal lines because 
of electromigration, we must be aware that the times 
to failure are measured under accelerated conditions, notably 
for current densities and ambient (oven) temperatures 
much higher than encountered under real-life operating 
conditions. Hence, even if we succeed in determining the 
distribution of the time to failure, there still remains the 
problem of extrapolating this accelerated distribution to 
an actual operating environment. This recalculation can 
be done with the help of Goldthwaite plots¹⁴ or by direct 
computer calculations, once the failure time density f(t) 
and the cumulative fail-time distribution function F(t) are 
estimated. From these the failure rate function $\lambda(t)$ can be 
calculated: 

$$\lambda(t) = \frac{f(t)}{1 - F(t)}. \tag{9}$$

Under accelerated conditions almost all samples of simu- 
lated failure times showed a good eye-fit to straight lines 
on lognormal paper, suggesting a lognormal distribution 
of failure times. The cumulative plot of the logarithm of 
simulated failure times of 50 lines is shown in Fig. 6. 
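
For a lognormal model the failure rate function has a closed form in terms of the error function. The Python sketch below fits the two lognormal parameters from a sample of failure times and evaluates the failure rate; the sample values and function names are made up for illustration, and the fit is a plain mean and standard deviation of the logarithms rather than a graphical eye-fit.

```python
import math
import statistics


def fit_lognormal(times):
    """Estimate (mu, sigma) as the mean and standard deviation of ln(t)."""
    logs = [math.log(t) for t in times]
    return statistics.mean(logs), statistics.stdev(logs)


def lognormal_pdf(t, mu, sigma):
    z = (math.log(t) - mu) / sigma
    return math.exp(-0.5 * z * z) / (t * sigma * math.sqrt(2 * math.pi))


def lognormal_cdf(t, mu, sigma):
    return 0.5 * (1 + math.erf((math.log(t) - mu) / (sigma * math.sqrt(2))))


def failure_rate(t, mu, sigma):
    """Failure rate f(t) / (1 - F(t)) for a lognormal failure distribution."""
    return lognormal_pdf(t, mu, sigma) / (1 - lognormal_cdf(t, mu, sigma))


# Hypothetical failure times (arbitrary units) for three simulated lines.
times = [math.exp(-1.0), 1.0, math.exp(1.0)]
mu, sigma = fit_lognormal(times)
print(mu, sigma, failure_rate(1.0, mu, sigma))
```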

However, other two-parameter distributions like the 
Weibull or the gamma family can be fitted fairly well, at 
least over certain ranges of failure rates. Attardo³ notes that 
"for example, life test data on aluminum conductors fit the 
lognormal and Weibull distribution equally well at relatively 
large percentages of failure (0.1 to 1.0%), but at lower 
percentages of failure, that is, within the region of 
interest, the projected failure rates differ by several orders 
of magnitude." 

Empirical evidence seems to indicate that electromigration 
failure times can be described better by the more 
optimistic lognormal than by the Weibull distribution.¹⁵ 
Relatively small sample sizes, large margins of measurement 
errors, and the normal variability because of the 
randomness of the grain structure, together with the lack of a 
deeper understanding of the underlying electromigration 
forces, preclude at this point a definite decision about the 
true distribution of the time to failure. However, 
simulations of larger numbers of samples are more convenient to 
perform than costly life tests. 

Summary 

To date, we have used this simulation tool to verify the 
origins of the bamboo effect for electromigration and the 
curvature of plots of lifetime versus current density. The 
model can also reproduce the quantitative effects seen in 
Arrhenius plots of lifetime versus temperature and exhibits 
failure distributions representative of actual life tests of 
metal lines under current and temperature stress. Future 
efforts will be directed at better understanding the correla- 
tion between grain structures and metal film properties 
and the resultant failure distributions for various stress 
conditions. 